1  Introduction 

In textbooks, or in explanations given by experienced engineers and
mathematicians, we often encounter the phrase “by inspection the solution
is . . .” This paper begins to develop an account of the role of inspection
methods in engineering problem solving generally, and in programming
specifically. An important motivation underlying this work is the belief
that, in order to further automate the programming process, we must have
better computational models of the problem solving methods used by
programmers.

The outline of the paper is as follows. In Section 1, engineering problem
solving is introduced as a domain of study and is compared with other
problem solving domains. Within the engineering context, two very different
kinds of problem solving method are contrasted: inspection methods and
uniform general methods.

In Section 2, the concept of inspection methods in programming is
developed in detail via an extended scenario of analysis by inspection.
This section also includes short examples of synthesis by inspection and
validation by inspection, which illustrate the shared knowledge (cliches)
underlying inspection methods.

Section  3  defines  a  formalism,  called  the  Plan  Calculus,  which  is  used  to 
codify  the  knowledge  underlying  inspection  methods  in  programming  in  a 
convenient,  canonical,  and  programming-language  independent  fashion. 

Section  4  concludes  the  paper  with  a  discussion  of  the  relationship  of 
the  Plan  Calculus  to  programming  languages  and  other  formalisms,  current 
limitations  of  the  Plan  Calculus,  and  further  work. 

A companion paper [39] describes an initial library of common program
forms,  which  has  been  compiled  using  the  Plan  Calculus,  and  its  use  in 
automated  systems  for  analysis  and  synthesis  of  programs. 

1.1  Engineering  Problem  Solving 

Programming is viewed here as a kind of engineering activity. This is the
appropriate view for understanding the programming involved in the
development of large software systems.¹ In this context, a first question
to ask is: What properties do different kinds of engineering have in common?

¹The other major school of thought is to view programming as a kind of
mathematical activity, which is more appropriate for understanding the
development of algorithms.

The first common property of engineering domains is the existence of a
set of standardized, well-understood, primitive building blocks. For
example, in electrical engineering all circuits are made up at the lowest
level of resistances, capacitances, inductances, and so on. Similarly, in
mechanical engineering, all devices eventually come down to the primitive
mechanisms of lever, gear, rod, pulley, and so on. In software engineering,
all programs can be constructed out of assignment, conditional, and
recursion. This feature of engineering domains distinguishes them from many
other problem solving domains studied in AI (for example, medical
diagnosis) in which there is no well-established primitive level of
description.

A second common property of engineering domains is that the central
problem can be posed abstractly as follows: Given the vocabulary of
primitives and the rules for their legitimate combination, devise a
composite (usually hierarchical) structure which has some desired behavior.
This characterization of engineering problems distinguishes them from other
AI problems (for example, playing chess) in which the relationship between
structure and function is not the central concern.

In addition to the central synthesis problem, engineers also need to be
able  to  analyze  a  device  (i.e.,  to  infer  properties  of  its  behavior  from  its 
structure),  and  incrementally  modify  (debug)  the  structure  of  a  device  in 
order  to  achieve  a  desired  modification  in  behavior. 

In  summary,  engineering  problem  solving  is  concerned  with  the  analysis, 
synthesis  and  debugging  of  hierarchical  objects  constructed  for  an  explicit 
purpose. 


1.2  Uniform  General  Methods 

Two  quite  different  approaches  have  evolved  for  solving  engineering  problems. 
One  approach,  which  I  call  uniform  general  methods,  takes  advantage  of 
the  fact  that  the  primitive  elements  of  the  domain  have  well-understood 
behaviors. For example, in electrical engineering, one way to determine the
frequency  response  of  a  linear  circuit  is  to  solve  a  set  of  equations  derived  from 
the  topology  of  the  circuit,  viewed  primitively  as  a  network  of  resistances, 
capacitances  and  inductances. 

Similarly  in  mechanical  engineering,  one  way  to  analyze  the  stresses  and 
strains  in  a  mechanical  structure  is  by  the  so-called  “finite  element  method.” 
This  method  also  comes  down  to  solving  (usually  by  computer)  a  set  of 
equations  derived  by  viewing  the  mechanical  structure  as  a  grid  of  primitive 
geometric  elements  that  interact  in  simple  ways. 

Programming also has its uniform general methods. For example, the
Floyd-Hoare approach [18, 24] to program verification starts with the
semantics of the programming language primitives and combines them
according to the structure of the program to derive a single large theorem
to be proved (again, usually by computer).

Uniform general methods, such as these examples, have several attractive
properties.  First,  they  are  based  on  firm  mathematical  foundations.  As  a 
result,  their  domain  of  applicability  is  well-defined — you  know  when  they  will 
work  and  when  they  will  not.  Second,  the  solution  process  is  algorithmic, 
and  thus  amenable  to  conventional  computerization. 

Despite these attractive features, the surprising fact is that experienced
engineers typically use uniform general methods only as a last resort. The
reason for this is that these methods typically return only an answer. They
yield little insight into what the engineer is ultimately concerned with,
namely the detailed relationship between structure and function in the
device under analysis. The engineer needs to understand this relationship
in order to modify the structure of the device, for example, to bring it
closer to achieving its desired function.

Unfortunately, in real engineering applications (including programming),
a detailed description of how the behavior of a composite device follows
from the interaction of the behaviors of its primitive components is
extremely complex. In response to this complexity, engineering communities
have evolved intermediate vocabularies, giving names to those few out of
all possible combinations of primitives that have been useful in practice.
The next section discusses the kind of problem solving which takes place in
an engineering environment that is enriched with this kind of knowledge.


1.3  Inspection  Methods 

Suppose you present an electrical engineer with a circuit and ask him to
answer a question about its behavior, such as: What is the gain (ratio
between the strength of the output signal and the strength of the input
signal)? One way of answering this question is to employ a uniform general
method, namely to methodically translate the structure of the circuit into
a corresponding set of equations, which can then be solved to obtain the
answer.

This is not, however, the kind of analysis method you are most likely
to elicit from an experienced engineer. If the given circuit is designed in
accordance with routine engineering practice, an experienced engineer will
first recognize the circuit. For example, he may say “this is a two-stage
audio amplifier.” Given this recognition, the task of answering the posed
question is greatly simplified. For example, in the case of a two-stage
audio amplifier, the engineer knows immediately that the gain may be
computed from the product of the ratios of certain pairs of resistors at
key points in the circuit. In electrical engineering, answering questions
about a circuit by first recognizing its form is called analysis by
inspection. Only if you intentionally concoct an obscure circuit can you
force an experienced engineer to resort to setting up equations.

Similarly in programming, suppose you present an experienced programmer
with a large data processing system, and enquire as to its maximum
running time for given size inputs. Rather than resorting to the first
principles of complexity analysis, the experienced programmer will first
recognize which of the standard algorithms for searching, sorting, etc. are
being employed and then use their known properties to compute the desired
property of the net behavior.

There  is  also  synthesis  by  inspection.  For  example,  faced  with  the  task 
of implementing a common electrical function, such as a high-gain,
low-impedance amplifier, the hallmark of an experienced electrical engineer is his
ability  to  retrieve  from  his  mental  (or  actual)  “cook  book”  an  appropriate 
first-cut  design  (which  he  may  subsequently  modify  and  refine).*  Similarly, 
faced  with  the  task  of  implementing  a  common  programming  behavior,  such 
as  associative  retrieval,  the  hallmark  of  an  experienced  programmer  is  his 
ability  to  call  to  mind  a  repertoire  of  appropriate  standard  techniques,  such 
as  hashing,  discrimination  nets,  or  property  lists. 

I call these engineering problem solving methods, based on the recognition
and use of standard forms, inspection methods; I call the standard forms
cliches. Examples of cliches in the domain of circuits include voltage
divider, emitter-coupled pair, and Schmitt trigger. Examples of cliches in
the domain of programs include bubble sort, doubly-linked list, and linear
search. Cliches form the shared technical vocabulary of a discipline.
Although the word cliche has a negative connotation when used in the
context of literary criticism, in engineering, the repeated use of the same
“forms of expression” is desirable. Reuse improves productivity in the
design process, as well as the understandability (and thus maintainability)
of the resulting devices.

A crucial part of any computational account of problem solving in an
engineering domain is therefore a representation for the cliches in that domain.
In  order  to  motivate  the  representation  for  programming  cliches  introduced 
in  Section  3,  Section  2  illustrates  the  properties  and  use  of  programming 
cliches  via  several  examples. 

Notions  similar  to  the  cliche  idea  appear  in  software  engineering  in  the 
work  of  Arango  and  Freeman  [3]  (domain  models),  Harandi  and  Young  [22] 
(design templates), and Lavi [29] (generic models); and in artificial
intelligence in the work of Minsky [35, 36] (frames, concept germs), Schank [50]
(scripts),  and  Chapman  [11]  (cognitive  cliches). 


2  Inspection  Methods  in  Programming 

The  two  goals  of  this  section  are  to  deepen  the  reader’s  understanding  of 
what  is  meant  by  cliches  in  programming  and  to  motivate  the  representation 
for programming cliches defined in Section 3. To achieve these goals, this
section  presents  an  informal  but  detailed  scenario  of  program  analysis  by 
inspection. 

Solving  an  analysis  problem  in  the  context  of  programming  amounts  to 
deriving  some  non-obvious  properties  of  a  program.  To  illustrate  the  role  of 
cliches  in  this  process,  let  us  put  ourselves  into  the  following  not-so-imaginary 
situation. 

Suppose you are part of the maintenance team for a large software system.
You have been assigned a system enhancement task which requires the use of
a hash table. In the utilities portion of the system sources, you find the
code shown in Figure 1. Unfortunately, as you begin to use this
implementation of hash tables in your application, you realize that the
documentation doesn’t answer an important question: How does this
implementation handle duplicate keys? More specifically: If you call
TABLE-INSERT with an entry whose key might already be in the table, do you
first have to call TABLE-DELETE to delete the old entry? (Perhaps, in the
original application, duplicate keys never occurred, so the implementor
didn’t think to document what the behavior was under these conditions.)

As  a  straw  man,  you  might  consider  solving  this  analysis  problem  by 
formulating  it  as  a  theorem — something  along  the  lines  of  proving  that  for 
any  table  t  and  entry  e, 

    table-delete(table-insert(t, e), key(e)) = t.

If  a  theorem  like  this  is  true,  then  you  can  feel  free  to  add  and  delete  entries 
without  worrying  about  duplicates.  If  it  is  not  true,  however,  you  need  to 
understand  how  the  proof  fails  so  that  you  know  what  aspects  of  the  behavior 
of  TABLE-INSERT  and  TABLE-DELETE  you  can  rely  upon. 

More likely, if you are an experienced programmer, you will take the
approach of first studying the code to discover what cliches were used
(what is sometimes called “reverse engineering”) and then answering the
question of interest based on your understanding of the design. In this
example, you know from experience that there are basically two ways to
handle duplicate entries in any aggregate structure: either you check for
duplicates at insertion time or you search for duplicates at deletion time.
The question then boils down to recognizing which (if either) of these two
decisions was made in the code. Note also that by taking the approach of
understanding the code completely first, you will be in a good position to
modify the program to fit your current application, if necessary.

(DEFUN TABLE-LOOKUP (TABLE KEY)
  (LET ((BUCKET (AREF TABLE (HASH KEY TABLE))))
    (LOOP
      (IF (NULL BUCKET) (RETURN NIL))
      (LET ((ENTRY (CAR BUCKET)))
        (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))
      (SETQ BUCKET (CDR BUCKET)))))

(DEFUN TABLE-INSERT (TABLE ENTRY)
  (PUSH ENTRY (AREF TABLE (HASH (KEY ENTRY) TABLE)))
  TABLE)

(DEFUN TABLE-DELETE (TABLE KEY)
  (LET* ((INDEX (HASH KEY TABLE))
         (BUCKET (AREF TABLE INDEX)))
    (IF (EQUAL (KEY (CAR BUCKET)) KEY)
        (SETF (AREF TABLE INDEX) (CDR BUCKET))
        (BUCKET-DELETE BUCKET KEY)))
  TABLE)

(DEFUN BUCKET-DELETE (BUCKET KEY)
  (LET ((PREVIOUS BUCKET))
    (LOOP
      (SETQ BUCKET (CDR PREVIOUS))
      (IF (NULL BUCKET) (RETURN NIL))
      (WHEN (EQUAL (KEY (CAR BUCKET)) KEY)
        (RPLACD PREVIOUS (CDDR PREVIOUS))
        (RETURN NIL))
      (SETQ PREVIOUS BUCKET))))

Figure 1. The Common Lisp functions above implement a hash table. Note
that the HASH function is not defined here; assume it is just a numerical
formula which, although it may also be a cliche, is not the topic of this
example. The KEY function simply extracts some field from an entry. There
should also be a function for making a new table.

(DEFUN TABLE-LOOKUP (TABLE KEY)
  (LET ((BUCKET (AREF TABLE (HASH KEY TABLE))))
    (LOOP
      (IF (NULL BUCKET) (RETURN NIL))                  ; linear search
      (LET ((ENTRY (CAR BUCKET)))
        (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))   ; linear search
      (SETQ BUCKET (CDR BUCKET)))))

Figure 2. Recognition of linear search cliche.
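
The caption of Figure 1 leaves HASH, KEY, and a table constructor
undefined. To experiment with the listing, one possible set of stand-ins is
sketched below. All three definitions are assumptions made for illustration
and are not taken from the paper: entries are taken to be (key . value)
pairs, HASH is built on the standard SXHASH function, and MAKE-TABLE is a
hypothetical name for the missing constructor.

```lisp
;; Hypothetical helpers; none of these definitions appear in the paper.

(defun key (entry)
  "Extract the key field of ENTRY, assumed here to be a (key . value) pair."
  (car entry))

(defun hash (key table)
  "Map KEY to a bucket index of TABLE. SXHASH agrees with EQUAL, which is
the test used by the functions in Figure 1."
  (mod (sxhash key) (length table)))

(defun make-table (&optional (size 16))
  "Create a table of SIZE buckets; each bucket starts as the empty list."
  (make-array size :initial-element nil))
```

With these in place, a call such as (TABLE-INSERT (MAKE-TABLE) (CONS 'A 1))
should run against the functions of Figure 1 as written.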

Let us now proceed step by step through an introspective account of
recognizing the cliches in the code in Figure 1. As well as introducing
further examples of cliches in programming, this scenario also illustrates
some important structural aspects of programming cliches which must be
addressed in the formal representation.

2.1  Program  Analysis  by  Inspection 

We  begin  with  the  first  function  in  Figure  1,  TABLE-LOOKUP.  This  function  is 
essentially  a  loop.  A  key  feature  of  a  loop  is  the  number  and  form  of  its  exit 
conditions.  The  loop  in  TABLE-LOOKUP  has  two  exits  as  indicated  in  Figure  2. 
More  specifically,  this  is  an  instance  of  the  linear  search  cliche: 

    A linear search is a loop in which a given predicate (the same
    one each time) is applied to a succession of values (in this case,
    the values of the variable ENTRY) until either: a value is found
    which satisfies the predicate, in which case the search is
    terminated and the value satisfying the predicate is made available
    outside the loop (in this case via (RETURN ENTRY)); or there are
    no more values, in which case the search is terminated with a
    failure indication (in this case, by returning NIL).

(DEFUN TABLE-LOOKUP (TABLE KEY)
  (LET ((BUCKET (AREF TABLE (HASH KEY TABLE))))
    (LOOP
      (IF (NULL BUCKET) (RETURN NIL))                  ; list enumeration
      (LET ((ENTRY (CAR BUCKET)))                      ; list enumeration
        (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))
      (SETQ BUCKET (CDR BUCKET)))))                    ; list enumeration

Figure 3. Recognition of list enumeration cliche.

(DEFUN TABLE-LOOKUP (TABLE KEY)
  (LET ((BUCKET (AREF TABLE (HASH KEY TABLE))))
    (LOOP
      (IF (NULL BUCKET) (RETURN NIL))                  ; linear search, list enumeration
      (LET ((ENTRY (CAR BUCKET)))                      ; list enumeration
        (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))   ; linear search
      (SETQ BUCKET (CDR BUCKET)))))                    ; list enumeration

Figure 4. Overlapping occurrences of cliches.

Figure  3  indicates  that  TABLE-LOOKUP  also  contains  an  occurrence  of  one 
of  the  most  familiar  Lisp  programming  cliches,  namely  the  CAR,  CDR,  NULL 
pattern of list enumeration. Note that “pattern” in this context does not
mean  a  particular  configuration  of  the  program  string  or  parse  tree,  but  rather 
a  particular  set  of  operations  connected  by  the  appropriate  data  and  control 
flow.  In  the  case  of  list  enumeration,  for  example,  the  input  to  the  NULL  test 
must  be  the  same  as  the  input  to  the  CAR  and  the  CDR;  and  control  must  exit 
the  loop  when  the  NULL  test  succeeds.  The  formal  representation  defined  in 
Section  3  supports  this  notion  of  pattern  in  the  definition  of  programming 
cliches. 
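
To make the distinction between syntax and flow concrete, the sketch below
writes the same search over a bucket three ways. The function names are
hypothetical, and entries are assumed to be (key . value) pairs so that CAR
stands in for the KEY function. The three bodies look different as text, yet
in each one NULL, CAR, and CDR are applied to the same list cell, and
control leaves the search when the NULL test succeeds.

```lisp
(defun find-entry-loop (bucket key)
  "Simple LOOP form, following TABLE-LOOKUP in Figure 1."
  (loop
    (if (null bucket) (return nil))
    (let ((entry (car bucket)))
      (if (equal (car entry) key) (return entry)))
    (setq bucket (cdr bucket))))

(defun find-entry-do (bucket key)
  "The same data and control flow expressed with DO."
  (do ((cell bucket (cdr cell)))
      ((null cell) nil)
    (when (equal (car (car cell)) key)
      (return (car cell)))))

(defun find-entry-rec (bucket key)
  "The same data and control flow expressed with recursion."
  (cond ((null bucket) nil)
        ((equal (car (car bucket)) key) (car bucket))
        (t (find-entry-rec (cdr bucket) key))))
```

A recognizer that matched only on the program text or parse tree would treat
these as three unrelated programs.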

Another important aspect of programming cliches illustrated in
TABLE-LOOKUP is the fact that occurrences of cliches can overlap. Figure 4
shows the superposition of the linear search and list enumeration cliches
recognized above. Notice that the NULL exit test fills two roles: it is the
failure exit of the linear search and also the empty-list test of the list
enumeration. This way of decomposing programs violates the strictly
hierarchical approach of most current programming methodologies. We will
see several examples, however, in which overlapping decomposition is
necessary in order to recognize all the cliches in a program.

The  code  for  TABLE-INSERT  is  only  one  line  long.  The  only  cliche  in 
TABLE-INSERT  has  already  migrated  into  the  programming  language:  The 
PUSH  macro  in  Lisp  captures  the  cliched  use  of  CONS  to  add  an  element  onto 
the  front  of  a  list,  as  in 

    (SETQ L (CONS ... L)).

(Section 4 discusses the relationship between cliches and programming
languages.)
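
As a small check of this equivalence, the sketch below applies PUSH and the
written-out SETQ/CONS form to two copies of the same list and compares the
results. The function name is hypothetical and exists only for this
illustration.

```lisp
(defun push-equals-cons-p (item items)
  "Return true if PUSH and the SETQ/CONS idiom build the same list."
  (let ((with-push (copy-list items))
        (with-cons (copy-list items)))
    (push item with-push)                     ; the cliche packaged as a macro
    (setq with-cons (cons item with-cons))    ; the same cliche written out
    (equal with-push with-cons)))
```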

Moving on to TABLE-DELETE (Figure 5), we see that the body of this
function is a conditional which checks for a common special case that comes
up in the implementation of destructive deletion operations, namely deleting
the element at the head of the data structure.^ The form of this cliche,
which might be called special case head deletion, is as follows:

    If the head of the data structure (in this case, the CAR of the list
    BUCKET) satisfies the criterion for deletion (in this case, its KEY is
    equal to the given key), then update all pointers to the head of
    the structure to point instead to the tail of the structure (in this
    case the CDR of the list). Otherwise, if the head of the structure
    is not to be deleted, use a deletion by side-effect operation which
    works for “internal” (non-head) elements.

This example illustrates, among other things, that data abstraction needs
to be a part of the formalization of programming cliches, since one wants to
refer abstractly in the cliche above to the “head” and “tail” of a structure,
separate from particular implementations (such as the CAR and CDR of a Lisp
list).

^Failure to check for this special case leads to a characteristic bug. For a
further discussion of bug cliches (a topic not pursued in this paper), see [51].

(DEFUN TABLE-DELETE (TABLE KEY)
  (LET* ((INDEX (HASH KEY TABLE))
         (BUCKET (AREF TABLE INDEX)))
    (IF (EQUAL (KEY (CAR BUCKET)) KEY)            ; special case head deletion
        (SETF (AREF TABLE INDEX) (CDR BUCKET))    ; special case head deletion
        (BUCKET-DELETE BUCKET KEY)))              ; special case head deletion
  TABLE)

Figure 5. Recognition of special case head deletion cliche.

(DEFUN BUCKET-DELETE (BUCKET KEY)
  (LET ((PREVIOUS BUCKET))
    (LOOP
      (SETQ BUCKET (CDR PREVIOUS))
      (IF (NULL BUCKET) (RETURN NIL))             ; linear search
      (WHEN (EQUAL (KEY (CAR BUCKET)) KEY)        ; linear search
        (RPLACD PREVIOUS (CDDR PREVIOUS))
        (RETURN NIL))                             ; linear search
      (SETQ PREVIOUS BUCKET))))

Figure 6. Recognition of linear search cliche.

(DEFUN BUCKET-DELETE (BUCKET KEY)
  (LET ((PREVIOUS BUCKET))
    (LOOP
      (SETQ BUCKET (CDR PREVIOUS))                ; trailing pointer list enumeration
      (IF (NULL BUCKET) (RETURN NIL))             ; trailing pointer list enumeration
      (WHEN (EQUAL (KEY (CAR BUCKET)) KEY)        ; (CAR BUCKET): trailing pointer list enumeration
        (RPLACD PREVIOUS (CDDR PREVIOUS))
        (RETURN NIL))
      (SETQ PREVIOUS BUCKET))))                   ; trailing pointer list enumeration

Figure 7. Recognition of trailing pointer list enumeration cliche.

Moving on to BUCKET-DELETE (Figure 6), note that this function also
contains a linear search. The syntax in this case is very different from the linear
search  in  Figure  2.  However,  the  data  and  control  flow  relationships  between 
the  two  search  exits  are  the  same. 

BUCKET-DELETE also has instances of the CAR, CDR, and NULL operations
with data and control flow between them satisfying the constraints of the
list enumeration cliche (see Figure 7). Again, although the syntax of this
occurrence of list enumeration is very different from the syntax in
TABLE-LOOKUP, we recognize the same cliche. Note that this occurrence of
the cliche has an additional bit of structure, which is a common extension
of list enumeration, namely a “trailing pointer.” The control and data flow
in this loop is arranged so that on each iteration there is a pointer (in
the variable PREVIOUS) to the cell in the list whose CDR is the current
cell being enumerated (in the variable BUCKET). This extension of list
enumeration, which might be called trailing pointer list enumeration, is
most commonly used (as is the case here) in connection with destructive
deletion operations.

This  example  illustrates  that  programming  knowledge  includes  not  only 
cliches,  but  also  relationships  between  them.  Extension  is  one  of  a  number  of 
different  relationships  between  cliches  which  are  supported  by  the  formalism 
described  in  the  next  section. 

Finally, note in Figure 8 that the occurrences of the list enumeration and
linear  search  cliches  in  BUCKET-DELETE  overlap  in  a  similar  manner  to  those 
in  TABLE-LOOKUP. 

A final cliche that can be recognized in BUCKET-DELETE is splice out, as
shown in Figure 9. The arbitrary use of side effects like RPLACA and RPLACD
can lead to extremely hard-to-understand code. In this case, however,
RPLACD is being used in a very specific context: its first argument is the
trailing pointer of a list enumeration and its second argument is the tail
(CDR) of the corresponding current pointer. This use of RPLACD removes the
current element from the enumerated list (by side effect). A cliche like
splice out is an example of how the recognition of cliches can bypass
intractable general-case reasoning.
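
The splice out step can be tried in isolation. The sketch below is a
minimal illustration, assuming a freshly consed list (destructive operations
should not be applied to quoted constants); SPLICE-OUT-AFTER is a
hypothetical name, and its body is the RPLACD form used in BUCKET-DELETE.

```lisp
(defun splice-out-after (previous)
  "Destructively remove the cell that follows PREVIOUS from its list.
PREVIOUS plays the role of the trailing pointer; the removed cell is the
current cell of the enumeration."
  (rplacd previous (cddr previous)))

(defun splice-out-demo ()
  "Remove B from a fresh list (A B C D) by side effect."
  (let* ((items (list 'a 'b 'c 'd))
         (previous items))          ; trailing pointer: the cell holding A
    (splice-out-after previous)     ; current cell (holding B) is unlinked
    items))                         ; => (A C D)
```

Because the context fixes which cells the two arguments denote, the effect
of the side effect can be stated without reasoning about arbitrary aliasing.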

Using  the  Results  of  the  Analysis 

Now  that  you  have  finished  analyzing  the  program,  you  are  in  a  position 
to answer the original question quite easily: This implementation does not
handle  duplicate  keys  at  all,  because  there  is  no  checking  for  duplicate  keys 
at  insertion  time  (TABLE-INSERT  just  does  a  push)  or  at  deletion  time  (the 
linear  search  cliche  used  in  TABLE-DELETE  stops  after  finding  the  first  value 
satisfying  the  criterion).  Therefore,  you  do  have  to  call  TABLE-DELETE  before 
each  call  to  TABLE-INSERT  in  which  the  entry  might  have  a  duplicate  key. 
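
This conclusion can be checked by running the code. The sketch below
repeats the four functions of Figure 1 so that it is self-contained, and
adds stand-ins that are assumptions rather than part of the paper: KEY takes
the CAR of a (key . value) pair, HASH is built on SXHASH, and the table is
an 8-element array. STALE-ENTRY-AFTER-DELETE is a hypothetical name for the
experiment.

```lisp
;; Assumed helpers (not from the paper).
(defun key (entry) (car entry))
(defun hash (key table) (mod (sxhash key) (length table)))

;; The functions of Figure 1, unchanged apart from letter case.
(defun table-lookup (table key)
  (let ((bucket (aref table (hash key table))))
    (loop
      (if (null bucket) (return nil))
      (let ((entry (car bucket)))
        (if (equal (key entry) key) (return entry)))
      (setq bucket (cdr bucket)))))

(defun table-insert (table entry)
  (push entry (aref table (hash (key entry) table)))
  table)

(defun bucket-delete (bucket key)
  (let ((previous bucket))
    (loop
      (setq bucket (cdr previous))
      (if (null bucket) (return nil))
      (when (equal (key (car bucket)) key)
        (rplacd previous (cddr previous))
        (return nil))
      (setq previous bucket))))

(defun table-delete (table key)
  (let* ((index (hash key table))
         (bucket (aref table index)))
    (if (equal (key (car bucket)) key)
        (setf (aref table index) (cdr bucket))
        (bucket-delete bucket key)))
  table)

(defun stale-entry-after-delete ()
  "Insert two entries under one key, delete once, then look the key up."
  (let ((table (make-array 8 :initial-element nil)))
    (table-insert table (cons 'x 1))
    (table-insert table (cons 'x 2))   ; duplicate key, no prior delete
    (table-delete table 'x)            ; removes only the newest entry
    (table-lookup table 'x)))          ; the older entry (X . 1) is still found
```

Under these assumptions a single TABLE-DELETE leaves the older entry in the
bucket, which is the behavior predicted by the analysis.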

Furthermore,  with  this  detailed  understanding  of  the  relationship  between 
the structure and function of the program, you are also in a good position to
modify  the  program,  if  desired.  For  example,  suppose  you  decide  to  handle 
duplicate  keys  at  deletion  time.  There  are  two  changes  you  need  to  make  to 
the  program. 

First, you need to replace the linear search cliche used in BUCKET-DELETE
by a related cliche, exhaustive linear search, which doesn’t stop after
finding the first value satisfying the criterion, but rather searches for
all values satisfying the criterion. The splice out action is then applied
to each entry found by the search.

Second, because there could be several duplicate keys at the head of a
bucket, the special case head deletion cliche in TABLE-DELETE needs to be
replaced by an exhaustive linear search, in which the head deletion action
(the SETF) is applied to each case found. (As a code compression, this loop
could be combined with the loop in BUCKET-DELETE.)

Viewing the hash table program as the composition of cliches like linear
search,  splice  out,  and  so  on,  these  changes  are  modular — a  matter  of  adding 
or  replacing  a  small  number  of  conceptual  parts — even  though  this  may  result 
in  many  scattered  changes  at  the  code  level. 
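
One way the two changes might look in code is sketched below. This is an
illustration of the modification described above, not code from the paper:
BUCKET-DELETE-ALL and TABLE-DELETE-ALL are hypothetical names, KEY and HASH
are assumed helpers, and the head deletion loop has been folded into the
bucket function so that the table function only stores the new head.

```lisp
;; Assumed helpers (not from the paper).
(defun key (entry) (car entry))
(defun hash (key table) (mod (sxhash key) (length table)))

(defun bucket-delete-all (bucket key)
  "Destructively delete every entry of BUCKET whose key is EQUAL to KEY.
Returns the new head of the bucket."
  ;; Head deletion applied exhaustively: skip every matching leading cell.
  (loop
    (if (or (null bucket) (not (equal (key (car bucket)) key)))
        (return nil))
    (setq bucket (cdr bucket)))
  ;; Exhaustive linear search with a trailing pointer over the remainder.
  (let ((previous bucket))
    (loop
      (if (or (null previous) (null (cdr previous)))
          (return bucket))
      (if (equal (key (cadr previous)) key)
          (rplacd previous (cddr previous))     ; splice out; PREVIOUS stays put
          (setq previous (cdr previous))))))    ; no match; advance

(defun table-delete-all (table key)
  "Delete all entries stored under KEY, then return TABLE."
  (let ((index (hash key table)))
    (setf (aref table index)
          (bucket-delete-all (aref table index) key)))
  table)
```

Note that after a splice the trailing pointer is deliberately not advanced,
since the cell that now follows it has not yet been tested.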

2.2  Program  Synthesis  by  Inspection 

The  notion  of  recognizing  familiar  forms  applies  not  only  to  analysis,  but  also 
to  the  synthesis  of  programs.  For  example,  consider  synthesizing  a  program 
to satisfy the following specification: Given a set b and a key k, return a
value e, such that

    (e ∈ b ∧ key(e) = k) ∨ (e = nil ∧ ∀x ∈ b [key(x) ≠ k]).

A  well-known  uniform  general  method  for  program  synthesis  is  to  treat 
such a specification as a theorem (literally, ∀b ∀k ∃e . . . ). If this theorem can
be  proved  using  constructive  proof  techniques  only,  then  the  resulting  proof 
is  essentially  a  program  which  satisfies  the  specification. 

More likely, if you are an experienced programmer, you will recognize that
this specification is not some arbitrary formula in first order logic, but
rather an instance of a common specification cliche, which might be called
find if present: Given an aggregate data structure, find an element
satisfying some criterion; or if there is none, return a distinguished
value. From experience, this specification suggests the combination of an
enumeration with a linear search cliche, as in the following code.

(LET ((BUCKET ...))
  (LOOP
    (IF (empty BUCKET) (RETURN NIL))
    (LET ((ENTRY (first BUCKET)))
      (IF (criterion ENTRY) (RETURN ENTRY)))
    (SETQ BUCKET (rest BUCKET))))


Enumeration is an abstract cliche comprised of the above pattern of data
and  control  flow  between  operations  on  an  abstract  data  type  that  supports 
the  operations  of  selecting  the  first  element  (first),  computing  an  aggregate 
with  all  but  the  first  element  (rest),  and  testing  for  empty  (empty).  The 
linear  search  cliche,  discussed  earlier,  is  comprised  of  the  pattern  of  data  and 
control  flow  associated  with  the  criterion  and  empty  tests  above. 

The next step in the synthesis is to fill in the criterion role of the linear
search with the code for testing the criterion of the specification (key(e) = k).

(LET ((BUCKET ...))
  (LOOP
    (IF (empty BUCKET) (RETURN NIL))
    (LET ((ENTRY (first BUCKET)))
      (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))
    (SETQ BUCKET (rest BUCKET))))

To obtain the code for the loop of TABLE-LOOKUP in Figure 1, the final
decision  to  be  made  is  to  implement  buckets  as  Lisp  lists.  This  amounts  to 
filling  in  CAR  for  first,  CDR  for  rest  and  NULL  for  empty. 

(LET ((BUCKET ...))
  (LOOP
    (IF (NULL BUCKET) (RETURN NIL))
    (LET ((ENTRY (CAR BUCKET)))
      (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))
    (SETQ BUCKET (CDR BUCKET))))

This  example  of  synthesis  by  inspection  brings  out  several  additional 
points  regarding  the  formalization  of  cliche.  First,  we  see  that  there  are 
standard  forms  of  specifications  as  well  as  programs.  This  suggests  a  wide- 
spectrum  language,  so  that  the  same  approach  can  be  applied  to  both  speci¬ 
fication  and  program  constructs.  Second,  we  have  seen  examples  of  two  more 
kinds  of  relationships  between  cliches,  namely  implementation  (enumeration 
and  linear  search  can  be  used  to  implement  the  find  if  present  cliche),  and 
specialization  (list  enumeration  is  a  specialization  of  enumeration). 
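
The three synthesis steps above can be mirrored in one parameterized function. The following is only a sketch in Common Lisp, not code from the paper: the names FIND-IF-PRESENT and BUCKET-LOOKUP and the keyword parameters are hypothetical, and entries are assumed to be (key . value) pairs so that CAR can stand in for the KEY accessor.

```lisp
;; Sketch only: FIND-IF-PRESENT and its keyword parameters are hypothetical
;; names introduced for illustration.
(defun find-if-present (aggregate criterion
                        &key (first-fn #'car) (rest-fn #'cdr) (empty-fn #'null))
  "Enumerate AGGREGATE and return the first element satisfying CRITERION,
or NIL (the distinguished value) if there is none."
  (loop
    ;; Enumeration roles: empty, first, rest.
    (when (funcall empty-fn aggregate) (return nil))
    (let ((entry (funcall first-fn aggregate)))
      ;; Linear search role: the criterion test.
      (when (funcall criterion entry) (return entry)))
    (setq aggregate (funcall rest-fn aggregate))))

;; Specialization to list enumeration with the criterion key(e) = k.
;; Assumption: entries are (key . value) conses, so CAR extracts the key.
(defun bucket-lookup (bucket key)
  (find-if-present bucket (lambda (entry) (equal (car entry) key))))
```

Leaving the three enumeration operations as defaults that can be overridden corresponds to the specialization step: the list version is obtained by taking CAR, CDR and NULL, and another aggregate type could supply its own operations.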

2.3 Program Validation by Inspection

Program validation is concerned with making sure that programs do what
they are supposed to, or conversely, getting rid of errors.

A uniform general method for program validation is program verification.
In this approach, a formal proof is constructed to guarantee that a program
satisfies a given formal specification (e.g., a set of preconditions and
postconditions). This is done by combining the specification with the axioms
for each language primitive in the program, yielding a single formula/theorem
to be proved. This formula can then be passed to a general purpose theorem
prover. If the theorem is true, then the program satisfies the specification.
Often, however, the theorem is not true, which means that the program has
an error. Unfortunately, when this happens, the theorem prover is not in a
position to give you much advice as to where in the program the error might
be or how to fix it.

Although  the  ultimate  goal  of  program  verification  is  to  confirm  that  a 
given  program  is  “correct”  with  respect  to  some  specification,  most  of  the 
verification process is actually spent dealing with programs that are not
yet  correct.  What  is  needed,  therefore,  is  a  complementary  approach  more 
oriented  towards  diagnosing  errors  in  terms  of  the  structure  of  the  program, 
so  that  the  programmer  has  some  hint  how  to  proceed.  One  such  approach  is 
a  kind  of  inspection  method  that  might  be  called  near-miss  cliche  recognition. 

Near-miss cliche recognition is based on the idea of near-miss pattern
matching, as used by Winston [64] and others. In near-miss recognition, a
cliche is recognized when most but not all of its required elements are present.
To illustrate, consider the following buggy version of the hash table bucket
deletion function.

(DEFUN BUCKET-DELETE (BUCKET KEY &AUX PREVIOUS)
  (LOOP
    (IF (NULL BUCKET) (RETURN NIL))
    (WHEN (EQUAL (KEY (CAR BUCKET)) KEY)
      (RPLACD PREVIOUS (CDDR PREVIOUS))
      (RETURN NIL))
    (SETQ PREVIOUS BUCKET)
    (SETQ BUCKET (CDR PREVIOUS))))

Under certain input data conditions, this function will cause execution to
be interrupted with an error report something like the following:

***ERROR*** RPLACD - NIL INVALID ARGUMENT.

Figure 10. Near-miss recognition of trailing pointer list enumeration cliche.
(The figure repeats the BUCKET-DELETE listing above, with the loop labelled
as a near-miss trailing pointer enumeration.)

Applying near-miss cliche recognition to this function (see Figure 10)
reveals a near-miss occurrence of the trailing pointer list enumeration cliche. In
this definition of BUCKET-DELETE, the appropriate list enumeration operations
are present with the appropriate relationships between them, and there is a
trailing pointer (in the variable PREVIOUS) whose CDR is the current cell being
enumerated (in the variable BUCKET), except on the first iteration. Based on
this recognition, the following helpful diagnostic message could be produced:

It looks like you are trying to implement a trailing pointer list
enumeration of BUCKET, with PREVIOUS as the trailing pointer. Note,
however, that on the first iteration of the enumeration, the CDR
of PREVIOUS is not guaranteed to be equal to BUCKET.

There are, of course, many ways of modifying the program to fix this bug
(a  correct  version  is  shown  in  Figure?).  What  this  example  illustrates  is  that 
the  same  knowledge  of  cliches  can  be  used  in  many  parts  of  the  programming 
process. 
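
As one illustration of how the diagnostic could be acted upon, the sketch below restores the missing invariant by starting PREVIOUS at a dummy header cell. This is only one of the many possible repairs and is not claimed to be the paper's own corrected version. The name BUCKET-DELETE-FIXED is hypothetical, entries are assumed to be (key . value) conses (so CAR replaces the KEY accessor), and the function is assumed to return the resulting bucket rather than NIL, since the first cell cannot be removed by side effect alone.

```lisp
;; One possible repair (an assumption, not the paper's corrected version).
;; Entries are assumed to be (key . value) conses.
(defun bucket-delete-fixed (bucket key)
  "Destructively remove the first entry whose key is KEY; return the bucket."
  (let* ((header (cons nil bucket))  ; dummy cell: (CDR PREVIOUS) is BUCKET
         (previous header))          ; from the very first iteration
    (loop
      (when (null bucket) (return (cdr header)))
      (when (equal (car (car bucket)) key)
        (rplacd previous (cddr previous))  ; splice out the current cell
        (return (cdr header)))
      (setq previous bucket)
      (setq bucket (cdr previous)))))
```

With the header cell in place, the trailing pointer relationship named in the diagnostic message holds on every iteration, so the RPLACD never receives NIL.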

Near-miss cliche recognition is obviously not a complete approach to
validation. It does not guarantee that a program does what you want, but only
that it does not have a certain class of structural flaws. On the other hand,
this approach does not require you to provide a formal specification, which
in many instances is at least as hard to write as the program itself.
Furthermore, near-miss cliche recognition, when it works, provides the programmer
with a very germane characterization of the error.

An interesting line of research, which is being pursued by Wills, is
to develop distance metrics which distinguish near-misses that are useful
diagnostics from those that are so far away as to be irrelevant.

This discussion of validation introduces another desideratum for the
representation of programming cliches. Since near-miss cliche recognition is not
a complete approach, it is desirable to provide a formal semantics for the
representation of cliches that will make it possible to apply a combination
of inspection methods and more general, theorem-proving methods to
validating programs. For example, the synthesis by inspection scenario in the
preceding section suggests a verification approach in which a proof structure
is built in parallel with the synthesis steps by combining pre-proved lemmas
(associated with the cliches), using general theorem-proving as the “glue.”
This hybrid approach is currently being pursued in a system by Feldman and
Rich [44].

3 The Plan Calculus

Formalizing  the  notion  of  inspection  methods  introduced  in  Sections  1  and  2 
has  two  steps.  The  first  step  is  to  define  a  representation  language  for  pro¬ 
gramming  cliches.  This  representation  language,  called  the  Plan  Calculus, 
is  the  topic  of  this  section.  The  second  step  is  to  use  the  Plan  Calculus  to 
codify  a  library  of  specific  cliches.  An  initial  library  of  cliches  for  the  routine 
manipulation  of  symbolic  data  is  described  in  a  separate  paper  [39]. 

3.1  Desired  Properties  of  the  Representation 

A  reader  of  the  scenarios  in  Section  2  might  be  left  with  the  impression 
that  a  programming  cliche  could  be  represented  most  directly  as  some  frag¬ 
ment  of  program  text,  perhaps  with  holes  in  it.  Although  this  is  an  effective 
expository  technique,  program  text  or  schemas  lack  several  important  prop¬ 
erties  that  are  desired  in  a  knowledge  representation  for  cliche,  especially 
for  the  purpose  of  building  automated  programming  tools.  Three  important 
properties that templates and schemas lack are:

•  Canonical  Form 

•  Convenient  Manipulation 

•  Language  Independence 

A  discussion  of  these  properties,  and  why  program  text  or  schemas  lack  them, 
serves  as  a  good  introduction  and  motivation  for  the  Plan  Calculus. 

The  first  property  which  program  text  or  schemas  lack  is  canonical  form. 
Consider  the  linear  search  cliche  as  an  example.  The  idea  of  a  linear  search 
could  be  expressed  informally  in  English  as  something  like  the  following. 


A  linear  search  is  a  loop  in  which  a  given  predicate  (the  same  one 
each  time)  is  applied  to  a  succession  of  values  until  either  a  value 
is  found  which  satisfies  the  predicate,  in  which  case  that  value  is 
made  available  outside  the  search;  or  there  are  no  more  values, 
in  which  case  the  search  is  terminated  with  a  failure  indication. 


In building a library of cliches, we would like there to be a unique formal
structure representing this concept. Unfortunately, in Lisp and most other
programming languages, this kind of computation can be written in many
different forms, such as:

(LOOP
  (IF exhausted (RETURN NIL))
  (IF (predicate current) (RETURN current))
  )

Or  using  PROG  with  only  one  RETURN,  instead  of  two: 


(PROG ()
 LP (COND (exhausted NIL)
          (T ...
             (IF (predicate current)
                 (RETURN current))
             (GO LP))))


Or  even  tail  recursively: 

(DEFUN SEARCH (...)
  (COND (exhausted NIL)
        (T ...
           (COND ((predicate current) current)
                 (T ...
                    (SEARCH ...))))))


The problem here is choosing which version to use. Viewed formally
as abstract syntax trees in the grammar of the programming language, the
different versions above have very different structures. Yet, considering the
semantics of the programming language, all three versions specify essentially
the same algorithm, i.e., the same set of computations with the same data
and control relationships between them.⁴
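
To make the point concrete, the three schemas can be instantiated for one particular case. The sketch below assumes that the succession of values is a Lisp list and that failure is signalled by NIL; the function names are hypothetical. The three definitions differ widely as text but apply the same predicate to the same values in the same order.

```lisp
;; Illustrative instantiation (assumed): values come from a list.
(defun search-loop (predicate values)
  "LOOP with two RETURNs."
  (loop
    (if (null values) (return nil))
    (if (funcall predicate (car values)) (return (car values)))
    (setq values (cdr values))))

(defun search-prog (predicate values)
  "PROG with a single RETURN; falling off the end of the PROG yields NIL."
  (prog ()
   lp (cond ((null values) nil)
            (t (if (funcall predicate (car values))
                   (return (car values)))
               (setq values (cdr values))
               (go lp)))))

(defun search-recursive (predicate values)
  "Tail recursive form."
  (cond ((null values) nil)
        ((funcall predicate (car values)) (car values))
        (t (search-recursive predicate (cdr values)))))
```

Note that Common Lisp itself does not require implementations to eliminate tail calls, so whether the recursive form runs in constant stack space depends on the implementation and its settings.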

The Plan Calculus remedies this problem by representing data and control
flow structure explicitly. For example, all three of the schemas above (and
many other such variations) are canonicalized to the single representation
shown in Figure 11.⁵ A programming cliche represented in the Plan Calculus,
such as Figure 11, is called a plan.

⁴ Some readers may feel that the tail recursive version is fundamentally
different. However, recent implementations of Lisp treat loops and tail
recursion as alternate stylistic expressions of iteration, i.e., tail recursion
is executed without accumulating stack depth.

The notation used in drawing diagrams of plans is described in detail
below. Note for the moment that the formalism takes its inspiration from
the kind of diagrams that programmers often scrawl on blackboards and the
backs of envelopes when in discussion with other programmers. A plan is
essentially a hierarchical graph structure made up of different kinds of boxes
and arrows. The inner rectangular boxes denote operations and tests, while
the arrows between boxes denote data flow (solid arrows) and control flow
(solid arrows with double cross-hatch marks).

A second desired property which program text or schemas lack is
convenient manipulation. As anyone who has ever written a complicated macro
package can attest, operations on program text, such as concatenation and
substitution, are in general a quite tricky business. Typical problems
include unintended interactions due to accidental duplication of identifiers and
awkward constructions due to mismatch of syntactic forms. Moreover,
manipulations which are conceptually simple from an algorithmic point of view
often correspond to inconvenient transformations at the program text level.
For example, consider combining a cliche of the form

(A  (B  ...)  (C  ...)  ...) 
with  another  cliche  of  the  form 

(F  (G  ...)  (H  ...)) 

such  that  the  output  of  G  is  used  as  the  third  input  to  A. 

Operating on these cliches in the program text form shown above, this
combination is achieved by a complicated sequence of rearrangements
resulting in code something like the following:

(LET  ((X  (G  ...))) 

(A  (B  ...)  (C  ...)  X) 

(F  X  (H  ...))) 

In the Plan Calculus the combination of these two cliches, each expressed
as a plan, is a matter of adding only the single data flow arc shown by the
bold line in Figure 12. This illustrates how the Plan Calculus is a
representation in which the operations that typically occur in the application task
(namely, manipulating the algorithmic content of programs) have a more
direct correspondence with the operations which are naturally supported by
the syntax of the representation (namely, addition, deletion and modification
of arcs and nodes in a directed graph).

⁵ A program which automatically performs this canonicalization has been
implemented by Waters [61].

Figure 12. Combining two plans by adding a data flow arc.
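
The contrast with text manipulation can be sketched with a toy graph encoding. Everything in the following block is an assumption made for illustration and is not the Plan Calculus syntax defined later: a plan is a list of box names plus a list of data flow arcs, each arc written as (source-box destination-box input-position).

```lisp
;; Toy encoding (assumed): boxes are symbols, arcs are
;; (source-box destination-box input-position) triples.
(defstruct plan boxes arcs)

(defun combine-plans (p q &rest new-arcs)
  "Union of the two plans plus the extra data flow arcs connecting them."
  (make-plan :boxes (append (plan-boxes p) (plan-boxes q))
             :arcs (append (plan-arcs p) (plan-arcs q) new-arcs)))

;; (A (B ...) (C ...) ...): B and C feed the first two inputs of A.
(defparameter *plan-a*
  (make-plan :boxes '(a b c) :arcs '((b a 1) (c a 2))))

;; (F (G ...) (H ...)): G and H feed the two inputs of F.
(defparameter *plan-f*
  (make-plan :boxes '(f g h) :arcs '((g f 1) (h f 2))))

;; The output of G also becomes the third input of A: one added arc.
(defparameter *combined*
  (combine-plans *plan-a* *plan-f* '(g a 3)))
```

In this encoding no variable such as X has to be invented and no expression has to be moved; the fan-out of the output of G is expressed simply by two arcs sharing a source.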

The use of data flow in the Plan Calculus also reduces the complexity of
reasoning about programs by eliminating a lot of spurious side effects. In a
conventional programming language semantics, every assignment statement
is a side effect. Most assignment statements, however, are not inherently
interesting state changes, but rather are part of a pattern of variable
assignments and references used to move data from its point of production to its
point(s) of use. The Plan Calculus models this use of variables explicitly as
data flow arcs.

For example, from a programming-language point of view, the following
code involves a side effect (to the variable X):

(SETQ  X  (P  A)) 

(Q  X) 


The corresponding plan, however, has no side effects in it: the use of the
variable X corresponds to a data flow arc from P to Q. (Side effects in the
Plan Calculus are discussed further in Section 2.6.)

The third, and most obvious, property that program text or schemas lack
is language independence. This is a problem for two reasons. First, from
a practical point of view, the compilation of libraries of cliches to support
automated programming tools is likely to be an expensive process, whose
cost will need to be amortized over as broad use as possible. A separate
library for each programming language makes this amortization more difficult.
Second, from a theoretical point of view, common experience tells us that if
a programmer knows how to write a cliche like hash table, linear search, or
bubble sort in Lisp, he also knows how to do it in other languages in which
he is fluent.

The relationship between the Plan Calculus and programming languages
is discussed further in Section 4. Modules have been implemented to translate
between the Plan Calculus and an assortment of programming languages [14,
59, 61].

An  additional  desideratum  for  the  representation  of  cliches  is  that  the 
formalism  be  neutral  between  analysis  and  synthesis.  This  turns  out  to 
be  of  practical  importance  in  building  interactive  programming  aids,  since 
in practice these two activities are intermingled. A neutral representation
of cliches is also theoretically more attractive than a representation tailored
specifically for analysis or synthesis only, since it is a priori a simpler account
of the phenomena.

3.2  Plans 

The choice of the term plan for the knowledge representation used in this
work is motivated from two directions. One sense of the term is taken from
viewing programming as a kind of engineering activity. Other engineering
disciplines have developed specialized schematic languages for representing
the structure and function of devices and partial designs. For example, an
electrical engineer uses circuit diagrams and block diagrams at various levels
of abstraction; a structural engineer uses large-scale and detailed blueprints
which show both the architectural framework of a building and also various
subsystems such as heating, wiring and plumbing; a mechanical engineer
uses overlapping hierarchical descriptions of the interconnections between
mechanical parts and assemblies. In this sense, the Plan Calculus is intended
to serve as a “blueprint language” for programs. Also, as in other engineering
disciplines, the same language is used to describe both specific devices and
the cliches out of which these devices are commonly built.

A fundamental characteristic shared by all these types of engineering
plans is that at each level there is a set of parts with constraints between
them. Sometimes these parts correspond to discrete physical components,
such as transistors in a circuit diagram. More often, though, the
decomposition is in terms of function. For example, a simple amplifier in an electrical
block diagram has the functional description $V_2 = kV_1$, where $V_1$ and
$V_2$ are the input and output signals, and $k$ is the amplification factor.
As far as this level of plan is concerned, the amplification may be realized in
any number of ways. A primitive component may be used or another plan may
be provided which decomposes the amplifier further. By analogy, plans in
programming specify the parts of a computation and constraints between them.

Another sense of the term plan is taken from the planning subfield of
AI. The goal of a planning algorithm is to find a sequence of actions which
transforms a given initial state of the world into a desired final state. This
problem is analogous to the synthesis of straight-line programs. In early
planning work (e.g., [17]), a plan was represented simply as a sequence of
actions. Sacerdoti [49], however, showed that it was much more efficient to
use a partially-ordered set of actions as the basic representation. In this
representation, the planning algorithm needs to consider only those ordering
constraints which are actually required by the current set of actions, rather
than forcing an arbitrary total ordering. By analogy, a set of operations
in the Plan Calculus may be partially ordered by control and data flow, as
opposed to operations in program text, which must be totally ordered.

Like  plans  in  the  planning  literature  [48],  plans  in  the  Plan  Calculus 
provide  representation  at  different  levels  of  abstraction.  Symbolic  evaluation 
of  plans  in  the  Plan  Calculus  [53]  is  also  very  similar  to  techniques  used  in 
planning. 

Structural  and  Logical  Sublanguages 

The  Plan  Calculus  is  divided  into  a  structural  sublanguage  and  a  logical 
sublanguage.  The  structural  sublanguage  is  the  portion  of  the  Plan  Cal¬ 
culus  shown  in  plan  diagrams.  The  logical  sublanguage  comprises  the  pre¬ 
conditions,  postconditions,  and  other  logical  statements  which  annotate  the 
diagrams. Some applications require only the structural part of the Plan
Calculus;  others  also  make  use  of  the  logical  sublanguage. 

The  following  sections  undertake  the  detailed  definition  of  the  Plan  Cal¬ 
culus  in  two  stages.  First  a  diagrammatic  notation  for  plans  is  introduced 
along  with  an  informal  description  of  its  semantics  in  terms  of  an  interpreter 
for  plan  diagrams.  Following  this  intuitive  introduction,  a  formal  syntax  for 
the structural sublanguage is given.

Figure 13. Examples of atomic plan diagram elements: an input/output
specification (Set-Add), a test specification (Some), and a join specification
(Join).

A formal semantics for the complete language has been developed [10,
41,  42],  but  is  beyond  the  scope  of  this  paper.  The  definition  of  the  logical 
sublanguage  is  also  omitted  here,  since  it  is  best  treated  within  a  larger 
discussion  of  the  formal  semantics. 


3.3  Plan  Diagrams 

A  plan  diagram  is  a  convenient  graphical  depiction  of  the  structural  portion 
of  the  Plan  Calculus.  Examples  of  the  atomic  elements  out  of  which  plan 
diagrams  are  composed  are  shown  in  Figure  13. 

Input/Output  Specifications 

The box labelled Set-Add in Figure 13 is an example of an input/output
specification. An input/output specification is drawn as a rectangular box
with arrows entering at the top (denoting the inputs) and leaving from the
bottom (denoting the outputs). Each input or output is labelled with a name
and a type, separated by a colon. For example, Set-Add has two inputs: Old
(of type Set) and Input (of type Any). The single output of Set-Add is
called New (of type Set). The names of the inputs and outputs within a
given input/output specification must be unique. However, the same names
may be reused in other specifications.

The logical portion of an input/output specification associates a set of
preconditions and postconditions with each plan diagram box. For example,
the postconditions of Set-Add state that the New set includes the Input
object, all the elements of the Old set, and no others. The logical sublanguage
also includes a hierarchy of types, in which Any is defined as the most general
data type.
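
The postconditions just described for Set-Add can be written down as an executable check. This is a sketch under stated assumptions rather than the logical sublanguage of the Plan Calculus: sets are modelled as duplicate-free lists compared with EQL, and both function names are hypothetical.

```lisp
;; Modelling assumption: a Set is a duplicate-free list compared with EQL.
(defun set-add (old input)
  "Input/output specification Set-Add: inputs Old and Input, output New."
  (adjoin input old))

(defun set-add-postcondition-p (old input new)
  "True when New includes Input, all the elements of Old, and no others."
  (and (member input new)               ; New includes the Input object
       (subsetp old new)                ; ... and all the elements of Old
       (subsetp new (cons input old))   ; ... and no others
       t))
```

The check is stated independently of how SET-ADD happens to compute its result, which is the sense in which a specification box describes what is produced rather than how.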

Test  Specifications 

The box labelled Some in Figure 13 is an example of a test specification. A
test specification is drawn as a rectangular box, in which the bottom part is
divided into two sides labelled “T” and “F”. The inputs to a test specification
are just like the inputs to an input/output specification. The outputs of a
test specification are divided into two groups. Those outputs produced when
the test succeeds are indicated leaving from the side of the bottom labelled
“T”; those outputs produced when the test fails are indicated leaving from
the side of the bottom labelled “F”. For example, Some has two inputs: the
Universe (of type Set) and the Criterion (of type Predicate). The output of
Some, which is defined only when the test succeeds, is called Output (of type
Any).

As with input/output specifications, the logical portion of a test
specification associates a set of preconditions and postconditions with each plan
diagram box. For example, the postconditions of Some state that the Output
(when it is produced) is a member of the Universe and that the Criterion is
true of it.

Test specifications also include a test condition, which is true if and only
if the test succeeds. The test condition of Some states that there exists an
element of the Universe such that the Criterion is true. (The generalization of
test specifications to n mutually exclusive test conditions is straightforward.)
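
A test specification such as Some can be approximated in Lisp by a function that reports which side was taken together with any output. The sketch below is an assumption made for illustration: the Universe is modelled as a list, the name SOME-TEST is hypothetical, and the two sides are encoded as a first return value of T or NIL.

```lisp
;; Sketch (assumed encoding): the first value says which side is taken,
;; the second value is the Output, meaningful only on the T side.
(defun some-test (universe criterion)
  (let ((tail (member-if criterion universe)))
    (if tail
        (values t (car tail))   ; "T" side: Output is produced
        (values nil nil))))     ; "F" side: no Output is defined
```

Using MEMBER-IF rather than FIND-IF keeps the test condition correct even when the satisfying element is itself NIL, since the tail returned on success is never NIL.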


Join  Specifications 

The box labelled Join in Figure 13 is an example of a join specification. A
join specification is drawn as a rectangular box with the top part divided into
two sides labelled “T” and “F”. Inputs to a join specification are indicated
entering the top of the box; outputs leaving from the bottom. Join
specifications are used to end conditional blocks begun by test specifications. The
inputs to a join specification are grouped similarly to the outputs of a test
specification. The inputs on either the “T” or the “F” side are consumed
only when the corresponding branch of the conditional block is executed.
The output of a join specification is always the same as whichever input is
consumed.

Figure 14. The three input/output and test specifications illustrated at the left
of the figure are combined using data flow to construct the plan diagram at the
right. Equality-Within-Tolerance checks whether two quantities are equal, within
some tolerance.

The join specification in Figure 13 has one input on each side and one
output. Other join specifications may have several inputs on each side,
provided that the number is the same on each side and on the output. (Join
specifications can be generalized to n mutually exclusive cases analogously
to test specifications.)

Unlike  input/output  and  test  specifications,  join  specifications  do  not 
correspond  to  any  real  computation  in  the  final  program.  Rather,  they  are 
a  technical  artifact  used  in  well-formed  plans  to  specify  which  data  is  made 
available  for  further  computation,  depending  on  which  branch  of  a  conditional 
is executed. For example, the logical conditions associated with Join state
that the Output is equal to the True-Input when the “T” case holds, or the
False-Input when the “F” case holds.

Data  Flow 

Input/output specifications, test specifications and join specifications are
connected  together  to  form  plan  diagrams  using  two  kinds  of  structural  con¬ 
straints. 

The first kind of structural constraint is data flow. Data flow is shown
in plan diagrams by a solid arrow connecting an output of one box with an
input of another. Data flow arcs may fan out (i.e., there may be several arcs
originating at a given output), but may not fan in (i.e., there may be only
one arc terminating at a given input). No directed cycles are allowed (loops
are represented using tail recursion, as described below).
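
The two restrictions on data flow arcs (no fan in, no directed cycles) are simple graph properties and can be checked mechanically. The following sketch assumes a toy encoding in which each arc is a (source-box destination-box input-name) triple; the encoding and the function names are illustrative assumptions, not part of the Plan Calculus definition.

```lisp
;; Assumed encoding: an arc is (source-box destination-box input-name).
(defun fan-in-free-p (arcs)
  "True when no (destination input) pair is the target of two arcs."
  (let ((targets (mapcar #'cdr arcs)))
    (= (length targets)
       (length (remove-duplicates targets :test #'equal)))))

(defun acyclic-p (arcs)
  "True when the boxes can be removed one at a time in dependency order."
  (let ((boxes (remove-duplicates
                (append (mapcar #'first arcs) (mapcar #'second arcs)))))
    (loop
      (when (null boxes) (return t))
      ;; A box is ready when no remaining arc points at it.
      (let ((ready (find-if (lambda (box)
                              (notany (lambda (arc) (eq (second arc) box))
                                      arcs))
                            boxes)))
        (unless ready (return nil))   ; every remaining box lies on a cycle
        (setq boxes (remove ready boxes))
        (setq arcs (remove ready arcs :key #'first))))))
```

Fan out is deliberately left unrestricted: two arcs with the same source but different targets pass both checks.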

Figure 14 shows a simple plan diagram, Equality-Within-Tolerance,
constructed using data flow. Equality-Within-Tolerance checks whether two
quantities are equal, within some tolerance. Each box in a plan has a unique
name, so that multiple occurrences of boxes of the same type may be
referred to unambiguously. These names are called the roles of the plan. The
roles of the plan Equality-Within-Tolerance are Subtract (of type Difference),
Normalize (of type Absolute-Value), and Compare (of type Less-Than). To
reduce clutter in plan diagrams, the names and type restrictions of the
inputs and outputs of boxes are usually omitted, since they can be found by
reference to the definition of the box type.

(LET ((X (A ...)))
  (D (B X) (C X)))

(LET* ((X (A ...))
       (Y (C X)))
  (D (B X) Y))

Figure 15. In the plan diagram at the left, data flow only partially constrains
the order of steps in the computation. Both versions of the code at the right are
allowed by this plan.

Figure 16. An example of a plan with a data flow constraint in which two inputs
are “wired together.”

Data flow constraints in plan diagrams are an abstraction of the various
different mechanisms by which the flow of data can be achieved in different
programs and in different languages. These mechanisms include nesting of
expressions, use of intermediate variables, and special forms. For example,
in the following Lisp code for Equality-Within-Tolerance, all the data flow is
achieved by nesting.

(< (ABS (- ... ...)) ...)

The same data flow could also be coded using an intermediate variable,

(LET ((X (- ... ...)))
  (< (ABS X) ...))

or a combination of nesting, an intermediate variable, and a special form.

(LET ((Y (PROG ...
           (RETURN (ABS (- ... ...))))))
  (< Y ...))


Data  flow  constraints  also  provide  another  kind  of  abstraction  of  program 
text:  Any  order  of  steps  is  allowed  in  the  final  program,  as  long  as  it  is 
compatible  with  (i.e.,  is  a  completion  of)  the  partial  order  specified  by  the 
data  flow  (and  the  control  flow — see  the  following  section).  An  example  of  a 
partially-ordered  plan  and  two  final  programs  is  shown  in  Figure  15. 
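
The notion of a completion of a partial order can be made concrete by enumerating every total order that respects a set of arcs. The sketch below assumes that arcs are (source destination) pairs over box names and uses a hypothetical function name; the four-box example is assumed to have the shape of the plan in Figure 15, in which A feeds both B and C, which in turn feed D.

```lisp
;; Assumed encoding: an arc (source destination) means SOURCE must
;; precede DESTINATION in any final program.
(defun linearizations (boxes arcs)
  "All total orders of BOXES in which every arc runs forward."
  (if (null boxes)
      (list nil)
      (loop for box in boxes
            ;; BOX may come first only if no remaining box must precede it.
            when (notany (lambda (arc)
                           (and (eq (second arc) box)
                                (member (first arc) boxes)))
                         arcs)
              append (mapcar (lambda (order) (cons box order))
                             (linearizations (remove box boxes) arcs)))))
```

For the arcs ((A B) (A C) (B D) (C D)) exactly two orders survive, one with B before C and one with C before B, matching the two code versions that a plan of this shape allows.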

A slightly different kind of structural feature which can also be thought
of as a data flow constraint is illustrated in Figure 16. In this plan, the inputs
to A and B are “wired together.” What this means is that when this plan
is combined with other plans, the data flow to A and B must come from the
same output. This feature also appears in Figure 11 earlier in this section.

Control  Flow 

The second kind of structural constraint in plans is control flow. Control flow
is shown in plan diagrams by a cross-hatched arrow between an exit point
of one box and an entry point of another. Input/output specifications have
a single entry point (at the top of the box) and a single exit point (at the
bottom of the box). Test specifications have a single entry point (at the top
of the box) and two exit points (one on each side of the bottom of the box).
Join specifications have two entry points (one on each side of the top of the
box) and one exit point (at the bottom of the box). Control flow arcs may
both fan in and fan out. No directed cycles are allowed.

Figure 17. A plan diagram illustrating control flow and data flow.
Compute-Absolute-Value computes the absolute value of a number.

Figure 17 shows a simple plan diagram, Compute-Absolute-Value,
constructed using control flow and data flow. Compute-Absolute-Value
computes the absolute value of a number by negating it if necessary.
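
The roles in such a plan can be spelled out in Lisp by giving each one its own binding. The sketch below is an assumed reading of the plan, not a transcription of Figure 17: a test decides whether the input is negative, a negation step is performed only on the “T” branch, and a join selects which value flows on. The variable names are hypothetical.

```lisp
;; Assumed decomposition into a test, a negate step and a join.
(defun compute-absolute-value (x)
  (let* ((negative-p (minusp x))              ; test specification
         (negated (if negative-p (- x) nil))  ; computed only on the T branch
         (output (if negative-p negated x)))  ; join: T input or F input
    output))
```

The final binding performs no arithmetic of its own, which reflects the remark above that a join does not correspond to any real computation in the final program.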

It is important to note the distinction being made here between
Absolute-Value and Compute-Absolute-Value. Absolute-Value is an input/output
specification (used, for example, in the plan Equality-Within-Tolerance in
Figure 14), which has the postcondition that the output is the absolute value
of the input. Compute-Absolute-Value is a plan (combination of steps) which
implements this specification. In general, there may be several different plans
which implement a given specification. The notion of a plan implementing a
specification is captured in overlays, a feature of the Plan Calculus described
below.

(PROGN
  (A)
  (B)
  (C)
  (D))

(PROGN
  (A)
  (C)
  (B)
  (D))

Figure 18. In the plan diagram at the left, control flow only partially constrains
the order of steps in the computation. Both versions of the code at the right are
allowed by this plan.

Figure 19. These two plan diagrams have the same meaning, due to the
transitivity of control flow.

Control flow constraints in plan diagrams are an abstraction of the various
different mechanisms by which the flow of control can be achieved in different
programs and in different languages. These mechanisms include nesting of
expressions, sequencing primitives, and special forms. For example, in the
following Lisp code for Compute-Absolute-Value, the necessary control flow
is achieved using the special form IF.

(IF (MINUSP X)
    (SETQ X (- X)))

The  same  control  flow  is  achieved  in  a  more  complicated  way  in  the  following 
code  through  the  interaction  of  the  special  forms  COND,  PROG  and  RETURN. 

(PROG . . .
  (COND ((MINUSP X) . . .)
        (T (RETURN)))
  (SETQ X (- X)))

Like data flow, control flow constraints also provide a partial-order
abstraction of program text. Conventional programming languages do not
distinguish between the necessary orderings between program steps and those
that are chosen arbitrarily. In the Plan Calculus, any order of steps is allowed
in the final program, as long as it is compatible with (i.e., is a completion
of) the partial order specified by the data and control flow. Thus a control
flow arc between box A and box B in a plan diagram does not mean that B
immediately follows A, but rather that B eventually follows A. An example
of a plan partially-ordered by control flow and two possible final programs is
shown in Figure 18.
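
The idea that a plan admits every total order compatible with its arcs can be
made concrete with a small enumeration. The following Common Lisp sketch is
illustrative only and is not taken from the Plan Calculus implementation: the
function name COMPATIBLE-ORDERS, the representation of arcs as dotted pairs,
and the four-step example (arcs from A to B, A to C, B to D and C to D) are
all assumptions made here.

```lisp
;; Hypothetical sketch: enumerate every total order of STEPS that is
;; compatible with the partial order given by ARCS.
;; Each arc (X . Y) means "Y eventually follows X".
(defun compatible-orders (steps arcs)
  (if (null steps)
      (list '())
      (loop for step in steps
            ;; STEP may come next only if no remaining step must precede it.
            unless (some (lambda (arc)
                           (and (eq (cdr arc) step)
                                (member (car arc) steps)))
                         arcs)
              append (mapcar (lambda (rest) (cons step rest))
                             (compatible-orders (remove step steps) arcs)))))
```

Under these assumptions, (COMPATIBLE-ORDERS '(A B C D) '((A . B) (A . C) (B . D) (C . D)))
returns ((A B C D) (A C B D)), which are the two step orders that appear in the
code of Figure 18.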

Unlike  data  flow,  control  flow  constraints  are  transitive.  For  example,  the 
two  plan  diagrams  in  Figure  19  have  the  same  meaning,  despite  the  fact  that 
they  have  different  syntax.  This  is  an  undesirable  lack  of  canonicalness  in 
the structural part of the Plan Calculus, which has been handled by building
knowledge of transitivity into the programs which manipulate the diagrams.
(See  Section  4  for  a  discussion  of  other  approaches  to  fixing  this  problem.) 
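
One way to picture what "building in knowledge of transitivity" could amount
to is to compare diagrams by the transitive closure of their control flow arcs
rather than by the arcs as drawn. The sketch below is an assumption about how
this might be done, not a description of the actual programs; the names
TRANSITIVE-CLOSURE and SAME-CONTROL-FLOW-P and the dotted-pair arc
representation are invented for illustration.

```lisp
;; Hypothetical sketch: two arc sets mean the same thing if their
;; transitive closures are equal. An arc is a dotted pair (FROM . TO).
(defun transitive-closure (arcs)
  "Return ARCS plus every arc implied by transitivity."
  (let ((closure (remove-duplicates arcs :test #'equal))
        (changed t))
    (loop while changed
          do (setf changed nil)
             (dolist (p closure)
               (dolist (q closure)
                 (when (eq (cdr p) (car q))
                   (let ((new (cons (car p) (cdr q))))
                     (unless (member new closure :test #'equal)
                       (push new closure)
                       (setf changed t)))))))
    closure))

(defun same-control-flow-p (arcs1 arcs2)
  "True if the two arc sets impose the same ordering constraints."
  (let ((c1 (transitive-closure arcs1))
        (c2 (transitive-closure arcs2)))
    (and (subsetp c1 c2 :test #'equal)
         (subsetp c2 c1 :test #'equal))))
```

For example, a diagram with arcs from A to B and from B to C compares equal to
one that also draws the redundant arc from A to C.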

Finally, note that the notion of control flow used here has a different flavor
from  the  notion  used  in  typical  flowchart  languages.  In  the  Plan  Calculus, 
a  control  flow  arc  is  a  constraint  on  possible  execution  orders,  whereas  in  a 
typical  flowchart  language,  a  control  flow  arc  is  more  like  an  abstract  “jump” 
instruction. 

3.4  A  Parallel  Execution  Model  for  Plan  Diagrams 

The meaning of a plan diagram is defined formally as the set of computation
sequences it allows (see [40, 41, 42]). A useful intuitive model for plan
diagrams,  however,  is  to  imagine  their  direct  execution  as  parallel  dataflow 
programs.  This  section  describes  a  set  of  rules  for  executing  plan  diagrams. 
A  symbolic  interpreter  for  plan  diagrams  along  these  lines  was  implemented 
by  Shrobe  [53]. 

Basically,  plan  diagrams  are  executed  by  having  “tokens”  flow  between 
boxes  along  the  data  and  control  flow  arcs  in  a  plan.  Boxes  consume  tokens 
at  the  top  and  produce  tokens  at  the  bottom.  The  tokens  that  flow  along 
data  flow  arcs  are  symbolic  objects  with  the  appropriate  properties.  The 
tokens  that  flow  along  control  flow  arcs  are  only  for  controlling  conditional 
execution,  and  have  no  other  properties.  Each  box  has  a  buffer  for  each 
input,  where  data  tokens  wait  until  they  are  consumed,  and  a  counter  for 
each  entry  point,  which  counts  how  many  control  tokens  have  arrived. 

Execution  begins  by  inserting  tokens  representing  the  starting  data  into 
the  input  buffers  of  the  data  flow  sources  of  the  diagram,  i.e.,  the  inputs 
of  boxes  that  have  no  incoming  data  flow  arcs.  Execution  then  proceeds  in 
parallel  according  to  the  activation  rules  for  each  kind  of  box. 

An input/output specification is activated when a token has arrived at
each of its incoming arcs, i.e., when there is a data token waiting in each of
its input buffers, and the entry point has counted a control token for each
incoming control flow arc. (If there are no incoming control flow arcs, then
this part of the condition is satisfied vacuously.)*

* Notice that there is a slight asymmetry here between data flow and control flow.
Incoming control flow arcs fan-in at a single entry point, whereas each data input is
allowed only a single incoming data flow arc. Another way of thinking of this, which
resolves the asymmetry, is to consider each incoming control flow arc to have a separate
“control flow input.” (The way control flow fan-in is typically drawn in plan diagrams
suggests this view.) This view, however, has the undesirable property that the number of
control flow inputs is not fixed for a given type of box, but depends on the context of use.

When an input/output specification is activated, if the input data
satisfies the preconditions of the specification, then output data satisfying the
postconditions is produced at each output, and a control token is produced at
the exit point. If the input data does not satisfy the preconditions, execution
terminates abnormally.

When tokens are produced at an output or exit point, they are propagated
along the data flow and control flow arcs to the input buffers and entry point
counters of the connected boxes. Where there is fan-out of data flow arcs,
the intuitive model is that multiple pointers to the same data are created, as
opposed to multiple copies. (This is to allow modelling of side effects; see
below.) Where there is fan-out of control flow arcs, it doesn’t matter whether
you copy or create multiple pointers, since control tokens have no distinct
properties.

A test specification is activated the same way as an input/output
specification. If the input data does not satisfy the preconditions of the
specification, execution is terminated abnormally. If the input data does satisfy the
preconditions and the test condition is true, then output data and a control
token are produced on the “T” side of the box; otherwise output data and a
control token are produced on the “F” side of the box.

A join specification is activated when tokens are present at all of the
incoming arcs of one or the other side of the box. When this occurs, a
control token is produced at the exit point, and the appropriate data tokens
are passed through to the corresponding outputs.

Figures 20 and 21 show an example execution of the Compute-Absolute-Value
plan.

A few points are worth noting about this execution model. First, the
purpose of the model is to provide intuition into the meaning of the diagrams,
not to provide a formal foundation. For example, to formally prove properties
of plans (such as whether a given plan terminates normally for all possible
inputs) requires manipulating the logical sublanguage of the preconditions,
postconditions, and test conditions, which is outside of this execution model.

Second, this execution model is particularly easy to visualize, because
there are no cycles in the control flow or data flow. This does not, however,
allow for looping computations. The next section introduces hierarchical
and recursively defined plans, which are used to model loops and recursive
computations.


Finally,  note  that  the  Plan  Calculus  is  a  wide-spectrum  language  (this 
point  will  be  discussed  further  in  Section  4).  Depending  on  how  specific  the 
input  data  is,  and  whether  the  steps  of  the  plan  are  totally  ordered,  executing 
a  plan  can  range  from  being  equivalent  to  executing  a  conventional  program, 
to  being  the  symbolic  evaluation  of  a  specification. 
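
The activation rule for input/output specifications given earlier in this
section can be sketched directly. The code below is a simplified illustration
written for this purpose, not Shrobe's interpreter: the BOX structure and the
functions DELIVER-DATA, DELIVER-CONTROL, ACTIVATED-P and FIRE are hypothetical
names, tokens are ordinary Lisp values rather than symbolic objects, and the
propagation of output and control tokens to connected boxes is left out.

```lisp
;; Hypothetical sketch of one input/output specification box.
(defstruct box
  name
  (inputs '())                 ; names of the data inputs
  (buffers (make-hash-table))  ; input name -> waiting data token
  (control-arcs 0)             ; number of incoming control flow arcs
  (control-count 0)            ; control tokens counted at the entry point
  precondition                 ; predicate on the input values
  action)                      ; function from input values to the output

(defun deliver-data (box input token)
  "Place a data TOKEN in the buffer for INPUT."
  (setf (gethash input (box-buffers box)) token))

(defun deliver-control (box)
  "Count one control token at the entry point."
  (incf (box-control-count box)))

(defun activated-p (box)
  "True when every input buffer holds a token and a control token has been
counted for each incoming control flow arc (vacuously true for zero arcs)."
  (and (every (lambda (input)
                (nth-value 1 (gethash input (box-buffers box))))
              (box-inputs box))
       (>= (box-control-count box) (box-control-arcs box))))

(defun fire (box)
  "Consume the waiting tokens and return the output token.
A failed precondition signals an error, standing in for abnormal termination."
  (let ((args (mapcar (lambda (input) (gethash input (box-buffers box)))
                      (box-inputs box))))
    (unless (apply (box-precondition box) args)
      (error "~A: precondition not satisfied" (box-name box)))
    (clrhash (box-buffers box))
    (setf (box-control-count box) 0)
    (apply (box-action box) args)))
```

With a box such as (MAKE-BOX :NAME 'NEGATE :INPUTS '(X) :CONTROL-ARCS 1
:PRECONDITION #'NUMBERP :ACTION #'-), delivering a data token alone does not
activate the box; it becomes ready to fire only once the control token has
also arrived.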

3.5  Hierarchical  Plans 

The type of a role in a plan, in addition to being an atomic element (an
input/output, test, or join specification), may also be a plan. This makes it
possible to reuse already defined cliches to build larger cliches in a hierarchical
fashion. For example, Figure 22 shows the plan Approx-and-Retry-Sqrt,
which has the plan Equality-Within-Tolerance (defined earlier in Figure 14)
as one of its subplans. Approx-and-Retry-Sqrt is a somewhat contrived plan
that computes the square root of a number using an approximation operation
and retries the approximation only once if necessary.

Note that the square-root approximation operation (Approx-Sqrt) in
Figure 22 has an extra input (Limit) specifying the maximum number of steps
to be used in the approximation. If the result of the operation is not within
tolerance (Check), the iteration limit is increased (Increase) and the
approximation is tried again (Retry). The role Check is itself a plan,
Equality-Within-Tolerance, with roles Subtract, Normalize and Compare. Note that
the formula used to compute the new, increased limit from the old limit and
the absolute value of the error is not specified in this diagram.

Within hierarchical plans, it is convenient to refer to parts at different
levels in the structure by composing role names into paths. For example, in
the plan Approx-and-Retry-Sqrt, the path Approx.Limit refers to the Limit
input of the Approx role. Similarly, Check.Compare.Lesser refers to the
Lesser input to the Compare role of the Check role.
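
Path composition can be illustrated with a toy representation in which a plan
is an association list from role names to parts, and a part is either another
such list or a leaf. This representation, the function RESOLVE-PATH, the
variable *APPROX-AND-RETRY-SQRT* and the keyword leaves are all assumptions
made for this sketch; only the role names come from the text.

```lisp
;; Hypothetical sketch: a plan as nested association lists of roles.
(defparameter *approx-and-retry-sqrt*
  '((approx . ((limit . :approx-limit-input)))
    (check  . ((compare . ((lesser . :compare-lesser-input)))))))

(defun resolve-path (plan path)
  "Follow PATH, a list of role names, down through nested plans."
  (if (null path)
      plan
      (let ((entry (assoc (first path) plan)))
        (unless entry
          (error "No role named ~A" (first path)))
        (resolve-path (cdr entry) (rest path)))))
```

Here the path Check.Compare.Lesser is written as the list (CHECK COMPARE
LESSER), and resolving it walks from the Check role to its Compare role and
then to that role's Lesser input.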

Notice in Figure 22 that a dashed box is drawn around the parts of a
subplan. This boundary is not, however, a barrier to establishing
connections between the parts of the subplan and the surrounding plan. Inputs
to intermediate steps of a subplan can be provided from the surrounding
plan and intermediate results can be “tapped.” For example, note the data
flow connection between the output of Check.Normalize and the input of the
Increase step.

approx-and-retry-sqrt

Figure 22. An example of a hierarchical plan.

indexed-sequence

Figure 23. An example of a data plan and the corresponding accessors. The
accessors are implicitly defined as part of the definition of the data plan.

new-indexed-sequence

bump-and-update

Figure 24. An example of a hierarchical plan with a mixture of data and
computation roles. The plan Bump-and-Update captures the cliched pattern of operations
on an indexed sequence in which the index is incremented (Bump) and a new term
is stored (Update).

Figure 25. An equivalent version of Bump-and-Update (see Figure 24), in which
explicit accessors have been used instead of using data plans.

Data Plans

The type of a role in a plan can also be a primitive data type, such as
Integer, Sequence, or Set. A plan all of whose roles are data types (or
hierarchically, data plans) is called a data plan. Data plans are used to represent
standard data structure aggregations which appear in the implementation
of more abstract data types. For example, Figure 23 shows the data plan
Indexed-Sequence, which represents the common cliche of a sequence (Base)
with an associated index pointer (Index). The Base is typically implemented
more concretely as an array. This data plan is, for instance, part of many
implementations of buffers, queues, and stacks.

The logical portion of a data plan associates an invariant with the data
aggregation. For example, the invariant of Indexed-Sequence states that the
Index must be greater than or equal to zero and less than or equal to the
length of the Base.

The definition of a data plan, such as Indexed-Sequence, automatically
defines a corresponding collection of input/output specifications for the
standard data structure accessors:

•  A  constructor,  which  takes  an  instance  of  the  appropriate  type  for  each 
of  the  roles,  and  produces  a  new  instance  of  the  data  plan  with  those 
parts.  A  precondition  of  this  operation  is  that  the  inputs  satisfy  the 
invariant  of  the  data  plan. 

•  A selector for each role, which takes an instance of the data plan, and
returns the corresponding part.

•  An alterant for each role, which takes an instance of the data plan and an
instance of the appropriate type for the role, and destructively modifies
the instance of the data plan by replacing the corresponding part with
the new part. A precondition of this operation is that the new part
together with the old parts for the other roles satisfy the invariant of
the data plan.

The  naming  conventions  for  these  accessors,  their  inputs,  and  their  outputs, 
are  illustrated  in  Figure  23. 
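
The three kinds of accessor, together with the invariant of Indexed-Sequence,
can be approximated in Common Lisp as follows. This is only a sketch: the
structure name ISEQ and the functions INVARIANT-HOLDS-P, MAKE-CHECKED-ISEQ and
ALTER-ISEQ-INDEX are hypothetical names chosen here (they are not the naming
conventions of Figure 23), and preconditions are modelled as run-time checks
that signal an error.

```lisp
;; Hypothetical sketch of a data plan with a checked invariant.
;; DEFSTRUCT supplies the selectors ISEQ-BASE and ISEQ-INDEX.
(defstruct (iseq (:constructor %make-iseq))
  base
  index)

(defun invariant-holds-p (base index)
  "The Indexed-Sequence invariant: 0 <= INDEX <= length of BASE."
  (and (integerp index)
       (<= 0 index (length base))))

(defun make-checked-iseq (base index)
  "Constructor. Its precondition is that the parts satisfy the invariant."
  (unless (invariant-holds-p base index)
    (error "Invariant violated: index ~A, base length ~A"
           index (length base)))
  (%make-iseq :base base :index index))

(defun alter-iseq-index (seq new-index)
  "Alterant for the Index role. Destructively replaces the index, provided
the new part together with the old Base still satisfies the invariant."
  (unless (invariant-holds-p (iseq-base seq) new-index)
    (error "Invariant violated: index ~A, base length ~A"
           new-index (length (iseq-base seq))))
  (setf (iseq-index seq) new-index)
  seq)
```

An alterant for the Base role would follow the same pattern, checking the new
base against the existing index before storing it.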

Implicit  Accessors 

In general, a hierarchical plan may have a mixture of data and computation
roles. Figure 24 shows an example of a hierarchical plan, Bump-and-Update,
which has the data plan Indexed-Sequence as a subplan (twice). This plan
expresses the cliched pattern of operations on an indexed sequence in which
the index is incremented and a new term is stored at that location in the
sequence, as for example, in the following code:

(DEFSTRUCT INDEXED-SEQUENCE BASE INDEX)

(LET ((I (1+ (INDEXED-SEQUENCE-INDEX Q)))
      (S (COPY-SEQ (INDEXED-SEQUENCE-BASE Q))))
  (SETF (ELT S I) ITEM)
  (MAKE-INDEXED-SEQUENCE :BASE S :INDEX I))

Notice that the Bump-and-Update plan is purely functional, i.e., there are
no side effects. New-Term (the type of the Update step) is a predefined
input/output specification associated with the primitive data type Sequence;
it returns a copy of the input sequence, with one term changed. Since there
is no sequence primitive in Lisp corresponding to New-Term, the code above
uses a combination of COPY-SEQ and SETF of ELT to implement this operation.
A related version of this plan which uses side effects is discussed below.

Notice also that the selector and constructor operations in the code above
for Bump-and-Update do not appear explicitly as boxes in the plan diagram.
It is a convenient feature of plan diagrams that these accessors are implicit in
the way data flow is connected to the roles of a data plan. For example, the
version of Bump-and-Update in Figure 24 can be taken as an abbreviation
for the version in Figure 25, in which the accessors are made explicit. The
general rules for interpreting data flow involving data plans are illustrated in
Figures 26 and 27.

Figure 26. For a data plan, D, with roles . . . , X, . . . , this figure shows how each
possible arrangement of data flow to a single role (on the left) is translated into
explicit accessors (on the right). See Figure 27 for more general case.

Figure 27. For a data plan, D, with roles . . . , X, . . . , Y, . . . , this figure shows
how various combinations of data flow to multiple roles (on the left) is translated
into explicit accessors (on the right). See Figure 26 for simpler cases.

3.6  Side  Effects 

Side effects are modelled in the Plan Calculus by introducing input/output
specifications which destructively modify their inputs. For example, the
destructive version of New-Term, called Alter-Term, has the same input and
output roles as New-Term. Its postconditions, however, specify that the Old
sequence is destructively modified to obtain the New sequence. (The formal
statement of this condition involves using a situational calculus for modelling
mutable objects; see [41, 42, 40].)

Figure 28 shows an example of a plan, called Destructive-Bump-and-Update,
involving side effects. This plan is the more common, destructive
version of Bump-and-Update, corresponding to the code below.
(Cross-referencing between the destructive and non-destructive versions of
specifications and plans is part of the library structure [39].)

(LET ((I (1+ (INDEXED-SEQUENCE-INDEX Q))))
  (SETF (ELT (INDEXED-SEQUENCE-BASE Q) I) ITEM)
  (SETF (INDEXED-SEQUENCE-INDEX Q) I))

Notice that the plan diagram for Destructive-Bump-and-Update has
explicit accessors, such as Alter-Indexed-Sequence-Index, for the parts of the
indexed sequence. The abbreviated data flow notation for data plans
described above cannot be used in plans with side effects because the correct
expansion of the abbreviations in the presence of side effects requires
non-local reasoning. For example, in Destructive-Bump-and-Update, there is no
alterant for the Base of the indexed sequence, because the destructive
modification of the sequence in the Update (Alter-Term) step also achieves a
destructive modification of the whole indexed sequence of which it is a part.

In the Plan Calculus, side effects arise only in connection with the
destructive modification of arrays, records, and other mutable data structures.
Most of the side effects in conventional programming languages, namely
assignment statements, are replaced by the use of data flow in the Plan
Calculus. (An exception is the use of global variables, whose current value is best
thought of as part of the state of the system. These are modelled using the
primitive mutable data plan, Cell, which has a single role, called Contents.)

In  general,  reasoning  about  side  effects  can  be  quite  complex,  especially 
if mutable objects may overlap (see [53, 54]).

3.7  Recursively  Defined  Plans 

Hierarchical plans can be recursively defined, i.e., the type of one or more
of the subplans can be the same as the type of the plan. For example,
Figure 29 shows the recursive data plans defining the standard list and binary
tree abstractions.

Recursive computations are also represented using recursive plan
definitions. For example, Figure 30 shows the recursively defined plan, called
Bintree-Enumeration, for enumerating (visiting every node of) a binary tree.
In the usual Lisp implementation of binary trees as cons cells, the following
code is an implementation of this plan.

(DEFUN ENUMERATE (TREE)
  (UNLESS (ATOM TREE)
    (ENUMERATE (CAR TREE))
    (ENUMERATE (CDR TREE))))


Notice, however, that this code makes an ordering commitment that is
not required by the Bintree-Enumeration plan. In this code, the nodes of
the tree are walked in left-to-right order (assuming CAR corresponds to Left
and CDR to Right). The Bintree-Enumeration plan is more general: it does
not force the traversal to occur in any particular order. An advantage of the
Plan Calculus over conventional program text is that it allows the expression
of more general cliches, such as this. Furthermore, to constrain the
Bintree-Enumeration plan to the traversal order used in the code above, all that is
required is to add a control flow arc from Continue-Left.End to Continue-
Right.Exit.
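
The ordering commitment can be seen by writing the traversal so that the
choice of subtree order is a parameter. The sketch below is an illustration
under assumptions, not part of the plan library: ENUMERATE-LEAVES and
LEAVES-VISITED are hypothetical names, the VISIT argument and the :LEFT-FIRST
keyword are additions made here, and only the leaves (atoms) are reported so
that the visiting order is observable.

```lisp
;; Hypothetical sketch: the same enumeration with either traversal order.
(defun enumerate-leaves (tree visit &key (left-first t))
  "Call VISIT on every leaf of TREE, a binary tree built from cons cells."
  (if (atom tree)
      (funcall visit tree)
      (let ((near (if left-first (car tree) (cdr tree)))
            (far  (if left-first (cdr tree) (car tree))))
        (enumerate-leaves near visit :left-first left-first)
        (enumerate-leaves far visit :left-first left-first))))

(defun leaves-visited (tree &key (left-first t))
  "Return the leaves of TREE in the order ENUMERATE-LEAVES visits them."
  (let ((seen '()))
    (enumerate-leaves tree
                      (lambda (leaf) (push leaf seen))
                      :left-first left-first)
    (nreverse seen)))
```

For the tree ((A . B) . C), the left-first order visits A, B, C and the
right-first order visits C, B, A. Both visit the same set of leaves, which is
all that an enumeration without an ordering constraint requires.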


list (role tail: list U null)

bintree (roles left: bintree U atom, right: bintree U atom)

Figure 29. Two examples of recursively defined data plans. Note the use of
disjunctive types.

list-enumeration

Figure 31. Iterative (tail-recursive) plan for enumerating a list.

Iterative Computations

Iterative computations are represented in the Plan Calculus by recursively
defined plans. For example, Figure 31 shows the plan for enumerating the
elements of a list. In the standard implementation of lists in Lisp, the following
code is an implementation of this plan.

(LOOP
  (IF (NULL L) (RETURN))
  . . . (CAR L) . . .
  (SETQ L (CDR L)))

This  is  the  familiar  CAR,  CDR,  NULL  cliche  that  appears  in  several  different 
syntactic  forms  in  the  hash  table  example  of  Section  2.  This  cliche  can 
alternatively  be  coded  in  the  following  recursive  form,  which  mirrors  more 
closely  the  structure  of  the  plan  in  Figure  31. 

(DEFUN ENUMERATE (L)
  (WHEN L
    . . . (CAR L) . . .
    (ENUMERATE (CDR L))))

The two versions of the code above are computationally equivalent. In
both cases, the amount of memory used in the computation does not need to
grow with each repetition of the body. (It is a defect of some compilers and
interpreters that these two versions are not executed in the same way.) A
recursive definition that corresponds to an iterative computation is often
referred to as tail-recursive. Although iterative computations are often loosely
referred to as “loops”, the essential characteristic of iteration is not the
existence of a cycle in control flow, but rather, the fixed space requirements of
the computation.*

* For a further discussion of the relationship between iteration, recursive definition, and
looping constructs, see [1], pp. 32-33.

The difference between a singly-recursive plan that gives rise to an
iterative computation, and one that gives rise to a recursive computation has
to do with whether there are any operations to be performed “on the way
up”, i.e., after the recursive invocation. This point can be illustrated by
comparing plans for the recursive versus iterative computation of factorial.

A plan for the recursive computation of factorial is shown in Figure 32.
This plan corresponds to the following code.

(DEFUN FACT (N)
  (IF (= N 1)
      1
      (* N (FACT (1- N)))))

Note that the multiplication (Accumulate) step in this plan requires input
from the end of the recursive invocation, and therefore must come after the
recursion. This computation is not iterative, but linear recursive: memory
grows linearly with the number of repetitions of the body.

iterative-factorial-body

Figure 33. Iterative (tail-recursive) plan for the computation of factorial. Note
that an auxiliary plan definition (not shown here) is required to specify
initialization of the accumulated product to 1.

A  tail-recursive  plan  for  the  iterative  computation  of  factorial  is  shown 
in  Figure  33.  This  plan  corresponds  to  the  following  recursive  definition. 

(DEFUN FACT-ITER (N F)
  (IF (= N 1)
      F
      (FACT-ITER (1- N) (* N F))))

Factorial of n is computed by calling FACT-ITER with the accumulated product
(F) initialized to 1.

(DEFUN FACT (N)
  (FACT-ITER N 1))

This  can  alternatively  be  coded  as  a  loop,  as  follows. 

(DEFUN FACT (N)
  (LET ((F 1))
    (LOOP
      (IF (= N 1) (RETURN F))
      ;; Multiply by the current N before decrementing it, as in FACT-ITER.
      (SETQ F (* N F))
      (SETQ N (1- N)))))


Notice that in the plan in Figure 33 there are no computations to be
performed after the recursive invocations. (Joins do not count as computations,
but are really part of the data and control flow constraints.)

Another example of a tail-recursive plan is the Linear-Search plan in
Figure 11, which captures the linear search cliche used in the hash table
example. A taxonomy of iterative cliches has been developed by Waters [olb
and elaborated by Rich [39].

3.8 Overlays

Programming  knowledge  includes  understanding  many  kinds  of  relationships 
between  plans.  One  important  kind  of  relationship  is  how  an  instance  of 
one  plan  can  be  viewed  as  an  instance  of  another.  Overlays  are  the  general 
facility in the Plan Calculus for representing such shifts of viewpoint.
Examples of overlays given below capture the common programming notions of
implementing a specification, data abstraction, and optimization.


Implementing  a  Specification 

Figure 34 is an example of a simple overlay representing implementation
knowledge. The right side of the diagram is the Absolute-Value input/output
specification. The left side of the diagram is the plan,
Compute-Absolute-Value, which tests whether a number is negative and, if so, negates it. This
overlay represents the fact that the Compute-Absolute-Value plan is a correct
implementation of the Absolute-Value specification. (A statement of the
correctness conditions is given below.) Notice the distinction being made here
between the specification for absolute value, and one way of computing it,
even though these two are very close in this example. Although
Compute-Absolute-Value is the most obvious way of implementing Absolute-Value,
there are other possible ways, for example, squaring the number and then
taking the square root. Each way of implementing Absolute-Value is
represented by a different overlay, all of which have the same right side.

In addition to a left and right side, an overlay diagram also includes a
set of hooked lines, called correspondences, which identify the corresponding
objects in the two points of view.* In Figure 34, for example, the
correspondences identify the input of the absolute value specification with the input
of the test of the implementation plan, and the output of the absolute value
specification with the output of the join of the implementation plan.

* The idea of correspondences was stimulated in part by Sussman’s “slices” [57], which
he used to represent equivalences between electronic circuits.

Formally, an overlay defines a mapping from the set of instances of the
left side plan (the domain) to the set of instances of the right side plan
(the range). There may be different overlays with the same domain and/or
range. In order to be correct, the mapping defined by an overlay must be
single-valued, total and onto.*

* A mapping is onto iff each element of the range is the image of some element of the
domain.

The single-valued condition guarantees that the implementation process
loses no information, i.e., for a given overlay, the specification can always be
recovered from the implementation. The mapping may, however, be
many-to-one, so that the implementation typically is not uniquely determined by
the specification.

The total condition guarantees that each implementation instance
corresponds to some specification. (Typically, this is achieved by restricting the
domain of the overlay until this condition is satisfied.)

Finally, the onto condition guarantees that each specification is
implementable.

The logical sublanguage and the formal semantics of the Plan Calculus
provide the basis, in principle, to formally verify all of the overlays described
in this paper. An automated proof system which can be used for this task has
been implemented by Feldman and Rich [44, 15]. Thus far, however, these
conditions have only been used as an intuitive guide to writing overlays.
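
The three correctness conditions on an overlay mapping (single-valued, total
and onto) can be checked mechanically for a small finite relation. The
following sketch is a toy illustration only: real overlays map between sets of
plan instances that are generally not finite, and the pair-list
representation and the names SINGLE-VALUED-P, TOTAL-P and ONTO-P are
assumptions introduced here.

```lisp
;; Hypothetical sketch: a mapping is a list of (IMPLEMENTATION . SPECIFICATION)
;; pairs over explicitly listed finite domain and range sets.
(defun single-valued-p (pairs)
  "No domain element is paired with two different range elements."
  (every (lambda (p)
           (every (lambda (q)
                    (or (not (equal (car p) (car q)))
                        (equal (cdr p) (cdr q))))
                  pairs))
         pairs))

(defun total-p (pairs domain)
  "Every element of DOMAIN is paired with some range element."
  (every (lambda (x) (assoc x pairs :test #'equal)) domain))

(defun onto-p (pairs range)
  "Every element of RANGE is the image of some domain element."
  (every (lambda (y) (rassoc y pairs :test #'equal)) range))
```

The relation ((I1 . S1) (I2 . S1) (I3 . S2)) over the domain (I1 I2 I3) and
range (S1 S2) satisfies all three conditions while still being many-to-one,
since I1 and I2 both map to S1.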

Using  Overlays  in  Analysis  and  Synthesis 

The  knowledge  encoded  in  an  overlay  can  be  used  in  both  analysis  and  syn¬ 
thesis  of  programs.  In  analysis  by  inspection,  the  left  side  of  an  overlay  is 
matched  against  the  plan  representation  of  the  program  under  analysis.  If 
a  match  is  found,  then  the  part  of  the  plan  matching  the  left  side  of  tlie 
overlay  can  be  replaced  by  the  right  side  of  the  overlay.  The  correspondences 
provide  the  information  needed  to  connect  the  right  side  of  the  overlay  with 
the  appropriate  parts  of  the  surrounding  plan.  (See  example  in  F'igure  35.) 

The repeated application of this recognition process can be thought of
as a kind of parsing, where each overlay defines a grammar rule. (The sides
are reversed: The right side of the overlay corresponds to the reduced side of
the grammar rule; the left side of the overlay corresponds to the expansion
of the rule.) Note that this grammar will typically be ambiguous, because
there may be several overlays with the same left side, and also because the
parts of a plan may often be grouped in several different ways. Wills [Gd]
has constructed an automated system which performs analysis by inspection
using a graph-parsing approach.

domain.

A grammar is ambiguous if some sentences in the language do not have a unique
derivation.


Figure 36. An example of matching a plan in which copying is required before
replacement. The part of the larger plan matching the left side of the overlay
in Figure 35 is highlighted in bold. Notice that A is copied first, and then the
matched part of the plan is replaced by the right side of the overlay.
In synthesis by inspection, the right side of an overlay is matched against
the plan representation of the current synthesis state. If the right side of
the overlay is a single box, as in the case of implementation overlays, then
this is trivial. We will see below that the right side can also be a non-atomic
plan. The part of the plan matching the right side of the overlay is then
replaced by the left side of the overlay, again using the correspondences to
get the right connections. (See Figure 35.) In the grammar metaphor, synthesis
by inspection corresponds to running the same grammar as a generator. A system
which supports a kind of synthesis by inspection has been implemented by
Waters [61].

Note that in the process of matching and replacement, parts of the
matched plan may need to be copied before replacement is made. The parts
of the matched plan that need to be copied are any operations or tests whose
output has data flow going outside the matched area, and for which there is
no corresponding output on the other side of the overlay. Figure 36 shows an
example of when copying is required in the use of an overlay in analysis by
inspection. The same copying would be required in the synthesis direction if
the same plan were the right side of another overlay.

Data  Abstraction 

Data abstraction is represented in the Plan Calculus by overlays between
data plans. The data plan on the left side of the overlay is what is typically
called the concrete (or implementation, or representation) data type; the
data plan on the right side of the overlay is the abstract data type. As with
overlays in general, a data overlay must define a single-valued, total and onto
mapping from instances of the concrete data type to instances of the abstract
data type. This mapping is typically called the abstraction function in the
data abstraction literature (e.g., [30]).

Only the domain and range types of a data overlay can be indicated
in plan diagrams. The definition of the abstraction function requires the
logical/mathematical sublanguage. For example, Figure 37 shows the data
overlay, Indexed-Sequence-as-List, which represents one way of implementing
a list using an indexed sequence. The abstraction function for
Indexed-Sequence-as-List is defined as follows: The head of the list corresponds
to the term of the base sequence indexed by the index. The tail of the list
is recursively defined as the list implemented by the indexed sequence with
the same sequence and the index minus one. The empty list (nil) corresponds to
the indexed sequence with index zero.

Figure 37. An example of implementation knowledge involving data abstraction.
The data overlay, Indexed-Sequence-as-List, specifies how to implement a list
using a sequence and an index. Only the domain and range are indicated in the
plan diagram.
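
The abstraction function for Indexed-Sequence-as-List described above can be
written out directly. This is a minimal sketch under stated assumptions: the
class names `IndexedSequence` and `Cons` are hypothetical, terms of the base
sequence are assumed to be numbered from 1 (so that index zero denotes the
empty list), and `None` stands for nil.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class IndexedSequence:
    base: Tuple[object, ...]  # base sequence; terms numbered from 1 (assumption)
    index: int                # 0 means the empty list


@dataclass(frozen=True)
class Cons:
    head: object
    tail: Optional["Cons"]    # None plays the role of nil


def as_list(s: IndexedSequence) -> Optional[Cons]:
    """Abstraction function: view an indexed sequence as a list."""
    if s.index == 0:
        return None                               # empty list
    head = s.base[s.index - 1]                    # term selected by the index
    rest = IndexedSequence(s.base, s.index - 1)   # same sequence, index minus one
    return Cons(head, as_list(rest))


abc = as_list(IndexedSequence(("a", "b", "c"), 3))
```

Terms of the base sequence beyond the index do not affect the result, so
distinct indexed sequences can represent the same list; the mapping is
many-to-one.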

Data overlays are typically used to define other overlays. For example,
Figure 38 shows the definition of an overlay which describes how to implement
the Push operation on a list, when the list is implemented as an
indexed sequence (according to Indexed-Sequence-as-List). The left side of
this overlay is the Bump-and-Update plan introduced earlier. In this
implementation, the Old and New indexed sequences of the Bump-and-Update
plan correspond to the Old and New lists of Push, respectively. The object
which becomes the new term in Bump-and-Update corresponds to the object
being pushed onto the list.
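
The correspondences in this overlay amount to a commuting-diagram property:
applying Bump-and-Update and then viewing the result as a list gives the same
answer as viewing the old indexed sequence as a list and then applying Push.
The sketch below checks this on examples. It assumes that Bump-and-Update
increments the index and stores the input as the term at the new index, that
terms are numbered from 1, and that lists are modelled as Python tuples with
the head first; all function names are hypothetical.

```python
from typing import Tuple

IndexedSeq = Tuple[Tuple[object, ...], int]  # (base sequence, index)


def as_list(s: IndexedSeq) -> Tuple[object, ...]:
    """Indexed-Sequence-as-List: terms index, index-1, ..., 1, head first."""
    base, index = s
    return tuple(base[i - 1] for i in range(index, 0, -1))


def new_term(base: Tuple[object, ...], position: int, value: object) -> Tuple[object, ...]:
    """Non-destructive update of the term at a 1-based position."""
    items = list(base)
    if position == len(items) + 1:
        items.append(value)           # unbounded sequence: grow by one term
    else:
        items[position - 1] = value   # overwrite a term beyond the old index
    return tuple(items)


def bump_and_update(old: IndexedSeq, value: object) -> IndexedSeq:
    """Assumed behaviour: bump the index, then store the value there."""
    base, index = old
    bumped = index + 1
    return (new_term(base, bumped, value), bumped)


def push(old_list: Tuple[object, ...], value: object) -> Tuple[object, ...]:
    """Push on an abstract list: the value becomes the new head."""
    return (value,) + old_list


old = (("a", "b", "x"), 2)
new = bump_and_update(old, "c")
```

Because `new_term` returns a fresh tuple, `old` is unchanged after the call,
which matches the side-effect-free case of the overlay.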

Notice that two of the correspondences in the diagram for
Bump-and-Update-as-Push in Figure 38 are annotated with the name of the data
overlay Indexed-Sequence-as-List. This means that the Old indexed sequence
of Bump-and-Update viewed as a list according to Indexed-Sequence-as-List
corresponds to the Old input of Push, and similarly for the New roles. This
labelling convention is quite general. Any correspondence can be labelled
with the name of any function having the appropriate domain and range.
This means that this function is applied to the object on the left to
obtain the corresponding object on the right. One can think of an unlabelled
correspondence as meaning the identity function.

The postconditions of Push state that the head of the New list is equal to the
Input, and the tail of the New list is equal to the Old list.

Recall that the plan diagram shown for Bump-and-Update in Figure 38 is actually
an abbreviation for the version with explicit accessors shown in Figure 25.
With explicit accessors on the left side of the overlay, the correspondence
involving the Old indexed sequence would connect to the input to the selectors
at the top of the plan; the correspondence involving the New indexed sequence
would connect to the output of the constructor at the bottom.

Notice that using data overlays, the same data abstraction can be
implemented differently in different contexts; this is awkward in some
programming languages.

Finally, notice that the implementation knowledge in Figure 38 is for
the most abstract case, namely an unbounded list implemented using an
unbounded sequence, without side effects (the input and output lists of Push
are not identical; New-term is the non-destructive operation on sequences).
A plan library would also include overlays between versions of these plans in
which the Push operation can cause overflow, the base sequence has a fixed
length, and various operations are destructive.

Optimization

The most general form of overlay has a non-atomic plan diagram on each
side. Such overlays are most often used to capture optimization knowledge.
For example, Figure 39 shows an overlay having to do with optimizing a
certain pattern of operations on a list. The right side of this overlay is a plan
in which an object is pushed onto a list, the list is sorted, another object
is pushed onto the sorted list, and then it is sorted again. This pattern of
operations can be optimized as shown by the plan on the left side of the
overlay, in which the first sorting operation is omitted. One can think of this
overlay as embodying a small lemma in the theory of lists and sorting.

One would not particularly expect a programmer to write code matching
the right side of this overlay. However, patterns requiring optimization
can easily arise in the process of automated synthesis, when higher level
operations are expanded into implementations. For example, a simple
implementation for adding an object to a sorted list is to push the object onto the
list and then sort. Two such operations on the same sorted list implemented
this way would give rise to the pattern on the right side of this overlay.

Using an overlay such as Figure 39 in the synthesis direction, i.e., matching
the right side and replacing it by the left side, amounts to applying an
optimization. Using an overlay such as Figure 39 in the analysis direction,
i.e., matching the left side and replacing it by the right side, amounts to
"undoing" an optimization. It is often necessary to undo optimizations in
order to facilitate further recognition.
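
The lemma embodied by the overlay of Figure 39 can be checked on small inputs.
The following sketch models lists as Python tuples and uses hypothetical
function names; `right_side` is the unoptimized pattern (push, sort, push,
sort) and `left_side` is the optimized one with the first sort omitted.

```python
from itertools import product
from typing import Tuple

IntList = Tuple[int, ...]


def push(lst: IntList, item: int) -> IntList:
    """Push an item onto the front of a list."""
    return (item,) + lst


def sort(lst: IntList) -> IntList:
    """Return a sorted copy of the list."""
    return tuple(sorted(lst))


def right_side(lst: IntList, a: int, b: int) -> IntList:
    """Unoptimized pattern: push, sort, push, sort."""
    return sort(push(sort(push(lst, a)), b))


def left_side(lst: IntList, a: int, b: int) -> IntList:
    """Optimized pattern: the first sort is omitted."""
    return sort(push(push(lst, a), b))


def lemma_holds(values=range(3), max_len=3) -> bool:
    """Exhaustively compare both sides on all short lists over `values`."""
    for n in range(max_len + 1):
        for lst in product(values, repeat=n):
            for a, b in product(values, repeat=2):
                if right_side(lst, a, b) != left_side(lst, a, b):
                    return False
    return True
```

The two sides agree because the final sort depends only on which elements are
present, not on their order before sorting.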

In the grammar metaphor, an overlay with non-atomic plans on both the
left and right side corresponds to a context-sensitive grammar rule. Undoing
optimizations as part of recognition is therefore an inherently expensive
process.


3.9  Summary 

This section summarizes the structural sublanguage of the Plan Calculus
with a formal definition of its syntax. Note that the syntax of plan diagrams
allows plans that are not semantically well-formed, for example, for which
no possible executions exist (see [40, 41, 42] for more on semantics).

We begin with a set of primitive types, which are in the language. These
types provide the primitive data vocabulary, such as Integer, Sequence, and
Set, out of which specifications are built. The primitive type Situation is
used to model control flow and side effects.

There are two kinds of composite structures in the language:
specifications and overlays.

A specification is composed of a labelled tuple and a set of labelled edges.
A labelled tuple is a tuple in which the components are selected by arbitrary
distinct symbols (labels) instead of numbers. The set of valid labels for
the components of a specification is called its roles. The components of a
specification are either specifications or primitive types.

The edges of a specification are pairs of paths in the specification. A path
in a specification, A, is defined recursively as follows:

If r is a role of A, then r is a path in A.

If B is the component of A selected by r, and p is a path in B,
then r.p is a path in A.
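
The recursive definition of paths can be made concrete with a small sketch. It
assumes a hypothetical encoding in which a specification is a Python dict from
role names to components, a primitive type is a string, and role names contain
no dots; edges are omitted. The example roles are illustrative only.

```python
from typing import Dict, List, Union

Component = Union[str, Dict[str, "Component"]]  # primitive type name or specification


def paths(spec: Dict[str, Component]) -> List[str]:
    """Enumerate every path in a specification, following the two rules."""
    result = []
    for role, component in spec.items():
        result.append(role)                        # rule 1: each role r is a path
        if isinstance(component, dict):            # component is itself a specification
            result.extend(f"{role}.{p}" for p in paths(component))  # rule 2: r.p
    return result


def component_at(spec: Dict[str, Component], path: str) -> Component:
    """Follow a path role by role and return the selected component."""
    current: Component = spec
    for role in path.split("."):
        current = current[role]
    return current


# Hypothetical data plan with a nested indexed-sequence component.
example = {
    "old": {"base": "Sequence", "index": "Integer"},
    "input": "Object",
}
```

An edge of a specification is then simply a pair of strings drawn from
`paths(example)`.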

Given these definitions, the terminology of plan diagrams introduced in
the preceding sections arises out of classifying specifications according to
their components, as follows.

An input/output specification is a specification with exactly two Situation
components. These are the entry point and exit point roles, which are
labelled by convention In and Out. The remaining roles are partitioned into
two disjoint subsets, called the inputs and the outputs. There are no edges
in an input/output specification.

A test specification is a specification with exactly three Situation
components. These are the entry point, and the success and failure exit points,
which are labelled by convention In, Succeed, and Fail, respectively. The
remaining roles are partitioned into three disjoint subsets, called the inputs,
the success outputs and the failure outputs. There are no edges in a test
specification.

A data plan is a specification, all of whose components are either primitive
data types (i.e., primitive types other than Situation) or data plans. There
are no edges in a data plan.

A plan (the general case) is a specification, all of whose components are
either input/output specifications, test specifications, primitive data types,
or plans. The edges in a plan are labelled to indicate whether they are control
flow or data flow. Data plans are a special case of plans. The term temporal
plan is sometimes used to distinguish plans which are not data plans, i.e.,
which include at least one input/output or test specification.

An overlay is composed of a pair of specifications and a set of labelled
edges. The edges of an overlay are pairs of paths, in which the first element
of each pair is a path in the first specification and the second element of each
pair is a path in the second specification. The edges of the overlay are called
correspondences, and are labelled with the name of the function used to map
objects from the left side to the right side of the overlay.
4  Conclusion

This section discusses the relationships between the Plan Calculus and other
formalisms, reviews some of the limitations of the Plan Calculus, and
summarizes further work to be done.

4.1  Relation  to  Programming  Languages 

An often asked question is: Is the Plan Calculus (just) another (very high
level) programming language? As with many such questions, the heart of
the answer lies in defining the terms. In this case, it depends on just what is
meant by "programming language." Modern programming languages have
two essential purposes:

•  To describe computations precisely enough to be executed by a machine.

•  To  serve  as  a  communication  medium  between  program  writers  and 
human  readers. 

In  contrast,  the  two  essential  purposes  of  the  Plan  Calculus  are: 

•  To  describe  programming  cliches  in  a  canonical,  easy  to  combine,  and 
language-independent  form. 

•  To  serve  as  a  medium  for  automated  manipulation  of  programs. 

As we will see, these respective purposes are in some ways compatible, and in
other ways conflicting. The answer to the question is therefore not a simple
yes or no.

Conventional programming languages force the programmer to provide
enough detail so that a simple local interpreter (e.g., hardware, perhaps
with an intermediate compilation step) can execute the code. Unfortunately,
much of this detail, such as the variety of special forms used for binding
variables, looping, conditional branching, etc., is often irrelevant with respect
to the algorithmic content of the code. As discussed in Section 3.1, this aspect
of conventional programming languages conflicts with the canonicalness goal
of the Plan Calculus.

The goals of serving as a human communication medium and serving as
a medium for automated manipulation can also conflict. For human
communication, a critical restriction is the fact that information must ultimately
be laid out on a two-dimensional structure (i.e., on the retina). In contrast,
automated manipulation systems have no such inherent topological restriction.
It is possible (and often desirable) in such systems to have highly
interconnected information structures in which many kinds of information
are localized at a single point.

As discussed in Section 3.1, the graphical nature of the Plan Calculus is
motivated by a desire for ease of manipulation by an automated tool. As
plan diagrams grow in size, they very quickly become hard for humans to
understand visually. Although it may turn out that the Plan Calculus is a
good starting point for a graphically-oriented human communication
environment, how to best use graphics for programming is still an open research
question.

Wide-Spectrum  Languages 

Recently, the notion of programming language has been extended to include
so-called very high level languages (VHLL's). Some of these VHLL's are
executable, although not by a simple local interpreter, and not very efficiently.
Others are really specification languages, in the sense that the compiler is
making significant implementation decisions, such as the choice of data
structures and algorithm. Furthermore, most VHLL's are also wide spectrum, i.e.,
they include a conventional high-level language as a sublanguage.

The Plan Calculus is also a wide-spectrum language. The input/output
and test specifications used in a given plan may correspond to operations
typically available in a conventional programming language, or they may be
much more abstract. To illustrate this point, consider how one translates a
program from a conventional high-level programming language into the Plan
Calculus. First, the primitives of the programming language are divided into
two categories:

•  The "connective tissue" primitives, such as PROG, COND, SETQ, GO, and
RETURN in Lisp, which are concerned solely with achieving data and
control flow.

•  The primitive operations and tests, such as CAR, CDR, PLUS, NULL, MINUSP,
and so on, in Lisp, which perform actual computations.

Each primitive operation or test is translated into the corresponding
input/output or test specification. The connective tissue primitives are then
translated into the pattern of control flow arcs, data flow arcs, and join
specifications between the boxes of the plan.

In summary, the answer to the question "Is the Plan Calculus a programming
language?" is twofold: Yes, the Plan Calculus is a language with the expressive
power of a wide-spectrum, very-high-level programming language, but No, it
is not necessarily appropriate for programmers to use directly.

The  Evolution  of  Languages 

A second relationship of this work to programming languages is the role of
cliches in the evolution of languages. Typically, part of the advance from
a lower to higher level language involves moving an entire class of
decision-making from the realm of the programmer to the realm of the compiler.
For example, in moving from machine language to high-level languages, the task
of register allocation was moved to the compiler. As part of moving from
high-level to very-high-level languages, an attempt is being made to make
efficient data structure selection the responsibility of the compiler.

Another part of language evolution, however, involves identifying cliches
(common patterns of usage) in the lower language, and absorbing them into
the syntax of the next higher language. For example, the common patterns
of jumps and tests used to perform iteration in machine language became
the various looping forms of high-level languages. As part of moving from
high-level to very-high-level languages, an attempt is being made to extend
the syntax of languages to support common clusters of operations.

From  this  point  of  view,  what  it  means  to  be  a  cliche  is  not  absolute,  but 
rather  what  a  concept  is  called  between  the  time  it  is  identified  as  a  common 
usage  in  the  current  language  and  the  time  it  gets  absorbed  into  the  next 
higher  level  of  language.  However,  this  evolutionary  process  does  not  stop  at 
the  next  level  —as  long  as  a  language  is  used,  new  cliches  will  arise. 


4.2  Other  Formalisms 

Past efforts to codify programming knowledge have used one of the following
formalisms:

•  program schemas [19]

•  program transformations [4, 10, 12, 55]

•  program refinement rules [5]

•  formal grammars [47]

Although each of these representations has been found useful in certain
applications, none combines all of the important features of the Plan Calculus.
Program schemas (incomplete program texts with constraints on the
unfilled parts) have been used by Wirth [65] to catalog programs based on
recurrence relations, by Basu and Misra [8] to represent typical loops for which
the loop invariant is already known, and by Gerhart [19] and Misra [ iT] to
represent and prove the properties of various other common forms.
Unfortunately, as illustrated by the linear search example in Section 3.1, the
syntax of conventional programming languages is not well suited for the kind of
generalization needed in this endeavor.

Programming languages descended from Simula [l.'l], such as CLU [30]
and Alphard [52], provide a syntax for specifying standard forms, such as
linear search, in a more canonical way. However, there are two more
fundamental difficulties with using program schemas to represent standard
program forms, which Simula and its descendants do not solve. First, programs
(and therefore program schemas) are not in general easy to combine, nor are
they additive. This means that when you combine two program schemas,
the resulting schema is not guaranteed to satisfy the constraints of both of
the original schemas, due to such factors as destructive interactions between
variable assignments. Second, existing programming languages do not allow
multiple views of the same program or overlapping module hierarchies. The
reason for this is that, from the standpoint of these languages, a program is
still basically thought of as a set of instructions to be executed, rather than as
a set of descriptions (e.g., blueprints) which together specify a computation.

The most common approach for representing implementation relationships
between cliches is to use knowledge-based program transformation
and refinement rules [5]. The major deficiency of these formalisms, as
compared to overlays in the Plan Calculus, is their asymmetry between analysis
and synthesis. An overlay is made up of two plans, either of which can be
used as the "pattern." In a typical program synthesis step, the right side plan
is used as the pattern and the left side plan is instantiated as a further
implementation. Conversely, in a typical analysis step, the left side plan is used
as the pattern and the right side plan is instantiated as a more abstract
description. With program transformation and refinement rules, this sort
of symmetric use is not possible, since the right side is often a sequence of
substitution or modification actions to be executed, rather than a declarative
description that can be used as a pattern.

As opposed to the folding-unfolding and similar transformations of Burstall and
Darlington [10], which are intended to be a small set of very general
transformations that must be composed appropriately to construct intuitively
meaningful implementation steps.

Another formalism used for codifying programming knowledge is formal
(string) grammars. For example, Ruth [47] constructed a grammar (with
global switches to control conditional expansions) which represented the class
of programs expected to be handed in as exercises in an introductory PL/1
programming class. This grammar was used in a combination of top-down,
bottom-up and heuristic parsing techniques in order to recognize correct
and near-correct programs. Miller and Goldstein [34] also used a grammar
formalism (implemented as an augmented transition network) to represent
classes of programs in a domain of graphical programming with stick figures.
The major shortcoming of these grammars is that they are string-based and
therefore too close to the programming language.

4.3  Limitations  of  the  Plan  Calculus 

This section outlines a number of known limitations of the Plan Calculus, and
suggests some directions for their remedy. The Plan Calculus is just a first
step in developing knowledge representations for the programming domain.

Other  Kinds  of  Knowledge 

There are at least two fundamental kinds of knowledge used in the
programming task that the current Plan Calculus has no facilities to express.

One such kind of knowledge concerns the performance properties of
algorithms and data structures. This kind of knowledge is used, for example,
to choose between alternative implementations of a data abstraction or
input/output specification. The most straightforward idea for adding this kind
of information to the Plan Calculus would be to simply annotate plans with
explicit performance statements, such as "this is a quadratic algorithm", and
so on. However, this approach only scratches the surface of the issue. In
order to make effective engineering trade-offs, a formal language is also needed
for characterizing the distribution of input data to a program. Going even
deeper, a representational framework is needed within which programs can
be analyzed to identify bottlenecks, and within which potential optimizations
can be evaluated and compared. Recent work in this area by Kant ‘J(<]
starts with a program representation similar to the Plan Calculus.

A second kind of knowledge that figures prominently in many
programming tasks concerns the structures and constraints of the application
domain. For example, Barstow [6, 7] has studied in detail the role of
mathematical models of physical processes in the synthesis of oil well log
interpretation software. Since programs can be written in any domain, the
problem of representing domain knowledge in programming is in principle no less
general than the general problem of knowledge representation. The challenge
from the point of view of the programming task, however, is how domain
knowledge interacts with "computer science knowledge" (algorithms, data
structures, performance properties, and so on). Neighbors [.'isj, for example, has
developed a transformation-based architecture in which domain descriptions
can be formalized and combined with software implementation knowledge.

Non-Local  Flow 

The Plan Calculus also has limitations in expressive power within the kind
of knowledge that it does address. Consider a program in which data flow is
achieved by one component updating a global data base and another
component querying it. Using the Plan Calculus straightforwardly, the entire
data base would have to be both an input and output to every module that
updated it, and an input to every module that queried it. This representation
does not allow for the fact that certain modules may only produce and
consume certain kinds of data, and that the intended data flow graph may
therefore be significantly smaller than the straightforward data flow graph.
What is suggested to solve this problem is a use of overlays in which a
mutable object (such as a data base) is conceptually partitioned into several
separate objects, each with a separate data flow.

A similar problem arises with the straightforward use of control flow in
the Plan Calculus to model THROW in Lisp, or interrupt handling in other
languages. In this case, the straightforward control flow graph has a
corresponding control flow exit from every module enclosing the point of the
THROW (or interrupt signal). Technically, this makes every enclosing module
into a test specification. Conceptually, however, this seems wrong. What is
needed is some way (again perhaps using overlays) of viewing the normal
control flow separate from the interrupt-based control.


Harel's statecharts ['J-'i] provide a nice solution to this problem within a
graphical formalism.
The data and control flow problems described above may be summarized
by observing that the current Plan Calculus is oriented toward representing
the local flow of data and control. Both of the examples above are a kind of
non-local flow.

Canonicalness 

A desired property of the Plan Calculus is that there be a unique
representation for each cliche. Since being a cliche is essentially an empirical,
pre-theoretic notion, deciding whether this property holds is not a formally
definable question. There is, however, a closely related formal property of
the Plan Calculus which is also desired (and unfortunately, does not hold
in certain cases): Syntactically distinct plans should also be semantically
distinct. The reason for desiring this property is illustrated in Figure 40.

As pointed out in Section 3.3, the two plans on the left of Figure 40 have
the same meaning, due to the transitivity of control flow. This is more than
just a problem of elegance: the same syntactic manipulation applied to each
plan can now result in two new plans with different meanings, as illustrated
in the figure. Deleting the control flow arc between B and C in the
top plan results in a plan in which C is unordered with respect to A and
B. Deleting the same arc in the bottom plan results in a plan in which C
must still follow A. A similar problem arises with control flow arcs that are
redundant with data flow arcs.
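
The effect described above can be reproduced with a small sketch. The Python below is illustrative only and is not part of the Plan Calculus implementation. It models a plan as a set of control flow arcs over three steps A, B and C, and takes the meaning of a plan to be the transitive closure of those arcs. The two arc sets are an assumed reading of the plans on the left of Figure 40, reconstructed from the text.

```python
from itertools import product

def transitive_closure(arcs):
    """Return the transitive closure of a set of (before, after) arcs."""
    closure = set(arcs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

# Two syntactically distinct plans (assumed reading of the left of Figure 40).
top_plan = {("A", "B"), ("B", "C")}
bottom_plan = {("A", "B"), ("B", "C"), ("A", "C")}

# Same meaning: the ordering constraints agree once transitivity is applied.
same_before = transitive_closure(top_plan) == transitive_closure(bottom_plan)

# Apply the same syntactic edit to both plans: delete the arc from B to C.
top_after = transitive_closure(top_plan - {("B", "C")})
bottom_after = transitive_closure(bottom_plan - {("B", "C")})

print(same_before)                # do the plans agree before the edit?
print(top_after == bottom_after)  # do they still agree after it?
print(("A", "C") in top_after, ("A", "C") in bottom_after)
```

After the deletion, only the bottom plan still orders C after A, which is the divergence the text describes.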

One solution to this problem is to canonicalize plan diagrams on the
transitive closure of the control flow. Under this solution, only the bottom
plan on the left of Figure 40 would be syntactically legal. A consequence of
this restriction is that it would be illegal to add a control flow arc between B
and C to the plan in the top-right of Figure 40; one would have to first add
the arc from A to C and then from B to C. Another way to guarantee this
restriction would be to automatically update the transitive closure whenever
a new control flow arc is added.
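
A minimal sketch of the second option, automatic maintenance of the closure, is given below in Python. The class name `ControlFlow` and its method are hypothetical, chosen for illustration. The sketch assumes the arc set is already transitively closed before each addition and does not check for cycles.

```python
class ControlFlow:
    """Control flow arcs kept closed under transitivity at all times."""

    def __init__(self):
        self.arcs = set()  # set of (before, after) pairs

    def add_arc(self, before, after):
        # Because self.arcs is already closed, these comprehensions find
        # every predecessor of `before` and every successor of `after`.
        preds = {a for (a, b) in self.arcs if b == before} | {before}
        succs = {b for (a, b) in self.arcs if a == after} | {after}
        # Each predecessor must now precede each successor.
        for p in preds:
            for s in succs:
                self.arcs.add((p, s))

plan = ControlFlow()
plan.add_arc("A", "B")
plan.add_arc("B", "C")  # the arc from A to C is added automatically
print(sorted(plan.arcs))
```

With this discipline a user never sees a plan that lacks the arc from A to C, so the two plans on the left of Figure 40 collapse into one representation.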

An alternative approach to solving this control flow problem is to move
control flow out of the structural sublanguage of the Plan Calculus, and
into the logical sublanguage. This is the approach taken in the most recent
implementation of the Plan Calculus [13]. This approach takes advantage of
facilities in the logical reasoning engine for efficiently maintaining transitive
relations, which are also needed for other purposes.
4.4 Further Work

This section describes further work by the author and others that extends
and builds upon the notions of inspection methods, cliches and plans. Some
of this work has already been completed and is therefore only summarized
here, with references to the full descriptions elsewhere. Current work in
progress and future directions are also described.

Libraries  of  Cliches 

The most important next step in this work is to use the Plan Calculus to
begin in earnest the task of codifying programming cliches. The author has
compiled an initial library of several hundred cliches in the area of basic
techniques for manipulating symbolic data structures (see [39, 41]). This
library includes:

•  data abstractions, such as set, graph, mapping, list, sequence, and tree.

•  operation cliches, such as addition, deletion and associative retrieval in
a set, inverting a mapping, and modifying arcs in a graph.

•  data structure implementation cliches, such as indexed sequence and
hash table.

•  cliched algorithm fragments, such as searching, generating and
accumulating.

In addition to the various kinds of overlays between these cliches, the
library is organized taxonomically using two kinds of inheritance-like
relationships: specialization and extension. Recalling that a plan is essentially
a set of parts with constraints between them, specialization corresponds to
adding constraints; extension corresponds to adding parts.
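
The distinction between the two relationships can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the library's representation: a plan is reduced to a set of part names and a set of constraint strings, and the cliche names, parts and constraints below are hypothetical examples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    """A plan viewed as a set of named parts plus constraints between them."""
    name: str
    parts: frozenset
    constraints: frozenset

def specialize(plan, name, extra_constraints):
    """Specialization: same parts, additional constraints."""
    return Plan(name, plan.parts, plan.constraints | frozenset(extra_constraints))

def extend(plan, name, extra_parts):
    """Extension: additional parts, same constraints."""
    return Plan(name, plan.parts | frozenset(extra_parts), plan.constraints)

# Hypothetical cliches, for illustration only.
search = Plan("search", frozenset({"enumerator", "criterion"}), frozenset())
linear_search = specialize(
    search, "linear-search", {"enumerator visits elements in sequence"})
search_with_action = extend(search, "search-with-action", {"action"})

print(sorted(linear_search.constraints))
print(sorted(search_with_action.parts))
```

In this model a specialization always has the same parts as its parent, and an extension always has the same constraints, which makes the two relationships easy to tell apart mechanically.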

The contents of this initial library were determined primarily by the
requirement of giving a complete account of the design of the hash table
example of Section 2. Barstow and Green [5, 21] have codified a similar body
of cliches in this same general area using a transformational formalism. One
direction to continue this codification is to deepen the coverage of the library
within the area of basic techniques. For example, it might be productive to
work systematically through basic texts such as [27] or [2].


Figure 41. A Venn diagram suggesting the overlap between programming cliches
used in different application areas.

A second direction to continue the codification of cliches is to broaden
the coverage of the initial library toward more specialized application areas.
Figure 41 suggests how this broadening might proceed from more general to
more specific cliches. The figure illustrates the relationship one would expect
to find between the cliches used in three areas of programming: statistics,
graphics, and systems. The intersection of all three areas in the center
represents basic programming techniques, where the initial codification effort
has focused. The overlap between each pair of areas represents cliches of
intermediate generality. The remaining part of each area represents the most
specialized cliches in that area.

The  Logical  Sublanguage 

The logical sublanguage of the Plan Calculus comprises the preconditions,
postconditions and other logical statements which annotate plan diagrams.
This logical language has been implemented in a reasoning system called
CAKE [15, 43, 14]. CAKE supports a typed propositional logic with limited
quantificational facilities. The system includes a type inheritance lattice
and special procedures for reasoning with sets, equality and other operators
with common algebraic properties, such as transitivity, symmetry, and so on.
Side effects are modelled in the language using a situational approach similar
to [35].

CAKE is a hybrid system in which manipulation of plan diagrams and
reasoning in the logical sublanguage are intermixed as needed. This is achieved
through an approach in which the formal semantics of data flow, control flow
and other syntactic structures of plan diagrams (see [40, 41, 42]) exist as
explicit logical assertions in the reasoning system's database. For example,
the semantics of a data flow arc is an equality between terms representing the
appropriate ports, plus a partial order assertion between the corresponding
situations.
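
As a rough illustration of this translation, the Python sketch below turns one data flow arc into the two assertions just described. The `Port` type, the tuple encoding of assertions, and the step and situation names are all assumptions made for the example. They are not CAKE's actual data structures or syntax.

```python
from typing import NamedTuple

class Port(NamedTuple):
    step: str       # name of the step (role) owning the port
    name: str       # port name within that step
    situation: str  # situation in which the port's value is produced or read

def data_flow_assertions(source: Port, sink: Port):
    """Translate one data flow arc into explicit logical assertions.

    The ("=", ...) and ("<=", ...) tuples are a hypothetical encoding
    chosen for this sketch.
    """
    return [
        # Equality between the terms representing the two ports.
        ("=", (source.step, source.name), (sink.step, sink.name)),
        # Partial order between the corresponding situations.
        ("<=", source.situation, sink.situation),
    ]

# Hypothetical arc: the output of a HASH step feeds the index of an AREF step.
arc = data_flow_assertions(Port("HASH", "output", "s1"),
                           Port("AREF", "index", "s2"))
for assertion in arc:
    print(assertion)
```

Once arcs are stored this way, a general reasoner can treat diagram edits and logical inference uniformly, which is the point of the hybrid design.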

Teleological  Structure 

The logical sublanguage makes it possible to talk about an important kind
of structure in a plan, in addition to its control and data flow structure. The
teleological structure of a plan is the set of logical relationships between the
preconditions and postconditions of its input/output and test specification
roles.

Figure 42 illustrates the concept of teleological structure with an abstract
example. The figure shows an implementation overlay between a plan with
three roles, P, Q, R, and an input/output specification, S. A, A', B,
etc., are formulae in the logical sublanguage, which form the preconditions
and postconditions of the various specifications, as shown. Data and control
flow arcs between P, Q, and R are omitted.

In order for the overlay in Figure 42 to be valid, each postcondition of S
must be implied by some postcondition of P, Q, or R; and each precondition
of P, Q, and R must be implied by either a postcondition of a preceding step
or a precondition of S. The pattern of these logical relationships provides a
deeper characterization of the purpose of each step in a plan than is provided
by control and data flow structure alone.
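
The two validity conditions can be written down directly. The Python sketch below is a simplification under stated assumptions: formulae are plain strings, implication defaults to identity rather than real inference, steps are taken in a fixed linear order, and the possibility of a later step undoing a postcondition is ignored. The formula names are hypothetical and are not taken from Figure 42.

```python
def overlay_is_valid(spec, steps, implies=lambda a, b: a == b):
    """Check the two validity conditions for an implementation overlay.

    spec    : dict with "pre" and "post" sets of formulae (the specification S)
    steps   : list of dicts with "pre" and "post" sets, in execution order
    implies : implies(a, b) is True when formula a implies formula b; the
              default (identity) stands in for a real reasoning system.
    """
    all_posts = set().union(*(s["post"] for s in steps))
    # 1. Every postcondition of S is implied by some step's postcondition.
    for goal in spec["post"]:
        if not any(implies(p, goal) for p in all_posts):
            return False
    # 2. Every step precondition is implied by a postcondition of a
    #    preceding step or by a precondition of S.
    available = set(spec["pre"])
    for step in steps:
        for need in step["pre"]:
            if not any(implies(a, need) for a in available):
                return False
        available |= step["post"]
    return True

# Hypothetical formulae, for illustration only.
S = {"pre": {"A"}, "post": {"F", "G"}}
P = {"pre": {"A"}, "post": {"B", "C"}}
Q = {"pre": {"B"}, "post": {"F"}}
R = {"pre": {"C"}, "post": {"G", "H"}}

valid = overlay_is_valid(S, [P, Q, R])
broken = overlay_is_valid(S, [P, R])  # nothing achieves F without Q
print(valid, broken)
```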

For example, we can see in Figure 42 that P is essentially a preparatory
step: all of its postconditions are prerequisites for later steps. Q and R, on
the other hand, are main steps: each contributes to accomplishing part of
the overall postconditions of S. (This vocabulary for describing steps of a
plan in terms of their purpose is due to Goldstein [20].)

The term teleological is from the Greek telos, meaning purpose. It was first introduced in [45].
The possibility that a postcondition achieved by one step may be "undone" by a
subsequent step is taken care of inside the logic through the use of situations.

Further understanding the role of teleological structure in program
analysis and synthesis is an important area for future work. For example, an
analysis of the teleological structure in Figure 42 suggests that step R may
be replaced by a weaker specification, since postcondition H is not needed to
accomplish any part of S. In CAKE, teleological structure is represented by
the dependencies in a truth-maintenance system.
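
The kind of analysis suggested here can be sketched as follows. The Python below is a toy under stated assumptions: formulae are strings matched by identity instead of logical implication, a step counts as "main" whenever one of its postconditions appears among the postconditions of S, and the assignment of formulae to P, Q and R is hypothetical except that H is a postcondition of R, as stated in the text. It does not model CAKE's truth-maintenance dependencies.

```python
# Hypothetical formulae for an overlay like the one described for Figure 42.
spec_post = {"F", "G"}  # postconditions of the specification S
steps = {
    "P": {"pre": {"A"}, "post": {"B", "C"}},
    "Q": {"pre": {"B"}, "post": {"F"}},
    "R": {"pre": {"C"}, "post": {"G", "H"}},
}

def classify(name):
    """Label a step 'main' if it achieves part of S, else 'preparatory'."""
    return "main" if steps[name]["post"] & spec_post else "preparatory"

def unneeded_postconditions(name):
    """Postconditions used neither by S nor by any other step's preconditions."""
    needed = set(spec_post)
    for other, step in steps.items():
        if other != name:
            needed |= step["pre"]
    return steps[name]["post"] - needed

for name in steps:
    print(name, classify(name), sorted(unneeded_postconditions(name)))
```

An unneeded postcondition, such as H for step R in this example, marks a place where the step's specification could be weakened.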

The  Programmer’s  Apprentice 

The work described in this paper has evolved within the context of a project
aimed at developing an intelligent, interactive assistant for software
development, called the Programmer's Apprentice. The Plan Calculus serves as the
"mental language" of the Apprentice.

Plan  diagrams  were  originally  developed  for  use  in  the  Apprentice  by  Rich 
and  Shrobe  [45]  and  later  extended  by  Waters  [58].  Overlays,  the  logical 
sublanguage,  and  the  formal  semantics  of  the  Plan  Calculus  were  added  by 
Rich  [41].  The  current  implementation  of  the  Plan  Calculus  in  CAKE  is  only 
the  most  recent  in  a  series  of  versions  that  have  been  experimented  with 
over  a  period  of  years.  As  part  of  these  experiments,  modules  have  been 
implemented  to  translate  between  the  Plan  Calculus  and  an  assortment  of 
programming  languages,  including  (subsets  of)  Lisp,  Ada,  PL/1,  Fortran, 
and  Cobol. 

As part of the Programmer's Apprentice project, prototype systems have
been  implemented  using  the  Plan  Calculus  to  demonstrate  both  analysis  and 
synthesis  by  inspection. 

Wills  [63]  has  implemented  a  prototype  analysis  by  inspection  system  that 
first  translates  an  input  program  into  the  Plan  Calculus  and  then  applies 
a graph parsing algorithm developed by Brotsky [9]. The grammar used
in the parsing is derived from Rich's library of cliches for basic symbolic
programming techniques [41, 39]. As a way of communicating the results of
its analysis, Wills' system produces a kind of program explanation. Figure 43
shows  the  result  of  applying  Wills’  program  to  the  TABLE-LOOKUP  function  of 
Section  2.  Note  that  the  convention  in  this  explanation  is  that  terms  with 
initial  capitals  are  the  names  of  cliches  or  roles;  terms  in  all  capitals  are 
identifiers  in  the  Lisp  program. 
(DEFUN TABLE-LOOKUP (TABLE KEY)
  (LET ((BUCKET (AREF TABLE (HASH KEY TABLE))))
    (LOOP
      (IF (NULL BUCKET) (RETURN NIL))
      (LET ((ENTRY (CAR BUCKET)))
        (IF (EQUAL (KEY ENTRY) KEY) (RETURN ENTRY)))
      (SETQ BUCKET (CDR BUCKET)))))

TABLE-LOOKUP  is  an  Associative  Retrieval  operation. 

If  there  is  an  element  of  the  Set  TABLE  whose  Key 
is  KEY,  then  it  returns  it;  otherwise  it  returns  nil. 
The  Key  is  extracted  from  an  entry  by  the  function  KEY. 
The  Set  is  implemented  as  a  Hash  Table. 

The Hash Table is implemented as an Array of Buckets,
indexed by hash code.

The  Hash  Function  is  HASH. 

The  Buckets  are  implemented  as  Lists.  There  are  no 
header  cells.  A  Linear  Search  is  used  to  determine 
whether  or  not  there  is  an  element  with  the  given  Key 
in  the  fetched  Bucket,  BUCKET. 


Figure  43.  Wills’  system  analyzed  the  undocumented  Common  Lisp  code  above 
and  automatically  produced  an  explanation  of  its  implementation  in  terms  of  a 
library  of  cliches. 


Define  a  linear-search  program  BUCKET-DELETE  with 
parameters  BUCKET  and  KEY. 

Fill  the  enumerator  with  a  trailing-pointer-list-enumeration 
of  BUCKET. 

Fill  the  search-criterion  with  (EQUAL  (KEY  (CAR  LIST))  KEY). 
Fill  the  action  with  a  splice-out  of  PREVIOUS. 

(DEFUN BUCKET-DELETE (BUCKET KEY)
  (LET* ((PREVIOUS BUCKET)
         (LIST (CDR PREVIOUS)))
    (LOOP
      (IF (NULL LIST) (RETURN NIL))
      (WHEN (EQUAL (KEY (CAR LIST)) KEY)
        (RPLACD PREVIOUS (CDDR PREVIOUS))
        (RETURN NIL))
      (SETQ PREVIOUS LIST)
      (SETQ LIST (CDR LIST)))))


Figure  44.  Waters’  system  synthesized  the  Lisp  code  above  from  the  description 
of  the  cliches  to  be  used. 
Waters [61, 60] has implemented a prototype synthesis by inspection
system, called KBEMACS (for Knowledge-Based Editor in EMACS). KBEMACS
allows a programmer to construct and modify programs more quickly and
reliably than using a conventional program editor, by supporting operations
on a program in terms of cliches. For example, Figure 44 shows a set of
commands given to KBEMACS that produces a version of the BUCKET-DELETE
program. The only difference between the version of BUCKET-DELETE produced
by KBEMACS and the version in Section 2 is the use of an unnecessary
temporary variable, LIST. This is due to the fact that the algorithm KBEMACS
uses for achieving data flow using variables is not optimal.

Other  Related  Work 

Program representations related to and derived from the Plan Calculus have
been used by others in the areas of program recognition [16], programming
tutors [28], program translation [14, 62], algorithm design [26], debugging [...,
31], and maintenance [33].

Soloway and Ehrlich [56] have conducted a number of empirical studies
with programmers which support the psychological reality of plans and